|
Name |
Accession |
Description |
Interval |
E-value |
| AvrBs3 |
NF041308 |
type III secretion system effector avirulence protein AvrBs3; |
258-1337 |
0e+00 |
|
type III secretion system effector avirulence protein AvrBs3;
Pssm-ID: 469205 [Multi-domain] Cd Length: 1179 Bit Score: 1771.03 E-value: 0e+00
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 258 IRPRRPSPARELLPGPQPDRVQPTADRGVSAPAGSPLDGLPARRTVSRTRLPSPPAPSPAFSAGSFSDLLRPFDPSLLDT 337
Cdd:NF041308 4 IRSRTPSPAREPQAGSQPDGVQPIAGRLVSTAASSPLDGLPARPAMSRTRQPATPAPSPAFSVGSFSDLLRQFDPSLFDP 83
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 338 SLLDSMPAVGTPHTAAAPAEWDEAQSALRAADDPPPTVRVAVTAARP--PRAKPAPRRRAAQPSDASPAAQVDLRTLGYS 415
Cdd:NF041308 84 SLFDSSPAFGAHHADAAPGEMDEVQSGLRAADDPQSHLSAAVTAPSPtpPRTQAAARRRSAQTSDASPAESVDLSTLGYT 163
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 416 QQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVTYQHIITALPEATHEDIVGVGKQWSGARALEAL 495
Cdd:NF041308 164 QQQQEQIKPNARSTVAQHHAALVGHGFTHAHIVELSKHAAALGTVADRYQAIIAVLPEATHKDIVEVGKQWSGARALQAL 243
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 496 LTDAGELRGPPLQLDTGQLVKIAKRGGVTAMEAVHASRNALTGAPLNLTPDQVVAIASNIGGKQALETVQRLLPVLCQD- 574
Cdd:NF041308 244 LMVAEELRGPPLQLDTGQLIKIAKRGGAPAVEAVHASRNALTGAPLHLTPHQVVAIASNNGGKQALETVQRLLPVLCQPp 323
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 575 HGLTPDQVVAIASNGGGKQALETVQRLLPVLCQ-DHGLTPDQVVAIASNGGGKQALETVQRLLPVLCQ-DHGLTPDQVVA 652
Cdd:NF041308 324 HGLTPEQVVAIASNDGGKQALETVQRLLPVLCQaEHGLTPDQVVAIASNIGGKPALETVQRLLPVLCQpPHGLTPDQVVA 403
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 653 IASNGGGKQALETVQRLLPVLCQD-HGLTPDQVVAIASHDGGKQALETVQRLLPVLCQDHGLTPDQVVAIASNNGGKQAL 731
Cdd:NF041308 404 IASNDGGKQALETVQRLLPVLCQApHGLTPDQVVAIASNDGGKQALETVQRLLPELCQAHGLTPDQVVAIASNGGGKQAL 483
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 732 ETVQRLLPVLCQ-DHGLTPDQVVAIASNIGGKQALETVQRLLPVLCQ-DHGLTPDQVVAIASHDGGKQALETVQRLLPVL 809
Cdd:NF041308 484 ETVQRLLPVLCQpPHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQpPHGLTPEQVVAIASHDGGKQALETVHRLLPVL 563
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 810 CQ-DHGLTPDQVVAIASNNGGKQALETVQRLLPVLCQD-HGLTPDQVVAIASHDGGKQALETVQRLLPVLCQ-DHGLTPD 886
Cdd:NF041308 564 CQaPHGLTPEQVVAIASHNGGKQALETVQRLLPVLCQRpYGLTPNQVVAIASNDGGKQALETVQRLLPVLCQaPHGLTPD 643
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 887 QVVAIASNGGGKQALETVQRLLPVLCQD-HGLTPDQVVAIASHDGGKQALETVQRLLPVLCQ-DHGLTPDQVVAIASNIG 964
Cdd:NF041308 644 QVVAIASNGGGKQALETVQRLLPVLCQRpHGLTPHQVVAIASNDGGKQALETVQRLLPVLCQpPYGLTPEQVVAIASNNG 723
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 965 GKQALETVQRLLPVLCQD-HGLTPDQVVAIASNGGGKQALETVQRLLPVLCQD-HGLTPDQVVAIASNGGGKQALETVQR 1042
Cdd:NF041308 724 GKQALETVQRLLPVLCQRpHGLTPDQVVAIASNDGGKQALETVQRLLPVLCQPpHGLTPDQVVAIASNDGGKQALETVQR 803
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 1043 LLPVLCQD-----------------------------------HGLTPDQVVAIASHDGGKQALETVQRLLPVLCQ-DHG 1086
Cdd:NF041308 804 LLPVLCDAphgltphqvvaiasniggrqaletvqrllpvlcqaHGLTPDQVVAIASNNGGKQALETVQRLLPVLCQpPHG 883
|
890 900 910 920 930 940 950 960
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 1087 LTPDQVVAIASNGGGKQALESIVAQLSRPDPALAALTNDHLVALACLGGRPAMDAVKKGLPHAPELIRRVNRRIGERTSH 1166
Cdd:NF041308 884 LTPHQVVAIASNIGGKQALESVVAQLSSPDPALAALTNDRLVALACIGGRPALNAVKKGLPHAVALIRKMNNRVPERTAH 963
|
970 980 990 1000 1010 1020 1030 1040
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 1167 RVADYAQVVRVLEFFQCHSHPAYAFDEAMTQFGMSRNGLVQLFRRVGVTELEARGGTLPPASQRWDRILQASGMkraKPS 1246
Cdd:NF041308 964 LVADLTQVVRVLSFFQCHSNPAQAFHEAMTQFEMSRQGLLQLFRRVGVTELEARSGTLPPASQRWQRILHALGL---KPS 1040
|
1050 1060 1070 1080 1090 1100 1110 1120
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 1247 PTSAQTPDQASLHAFADSLERDLDAPSPMHEGDQTRASSRKRSRSDRAVTGPSAQQAVEVRVPEQRDALHLP--LSWRVK 1324
Cdd:NF041308 1041 SASAQTPGQESLHAFADSLERELDAPSPMQDASQAGSSSRKRSRSDDPVHGFPAQQIAEALIPEHRDAPHLLplSSWGAK 1120
|
1130
....*....|...
gi 734521677 1325 RPRTRIWGGLPDP 1337
Cdd:NF041308 1121 RRRSRIAGGLPDP 1133
|
|
| GFP |
pfam01353 |
Green fluorescent protein; |
10-220 |
1.50e-48 |
|
Green fluorescent protein;
Pssm-ID: 426217 Cd Length: 211 Bit Score: 171.98 E-value: 1.50e-48
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 10 MHMKLYMEGTVNNHHFKCTSEGEGKPYEGTQTMRIKAVEGgPLPFAFDILATSFMYgsKTFINHTQGiPDFFKQSFPEG- 88
Cdd:pfam01353 1 MTHDLHMEGSVNGHEFDIVGGGNGNPNDGSLETKVKSTKG-ALPFSPYLLAPHL*Y--YQYLPFPDG-TSPFQAAVENGg 76
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 89 FTWERVTTYEDGGVLTATQDTSLQDGCLIYNVKIRGVNFPSNGPVMQKKTLGWEASTETL-YPADGGLEGRADMALKLVG 167
Cdd:pfam01353 77 YQVHRTFKFEDGGVLTIVFTYTYEGGHIKGEFTFQGSGFPPDGPVMTKSLTGWDPSVEKMiPRNDKTLVGDINWSLKLTD 156
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|....
gi 734521677 168 GGHLICNLKTTYRSKKP-AKNLKMPGVYYVDRRLERIKEADKETYVEQHEVAVA 220
Cdd:pfam01353 157 GKRYRAQVVTNYTFAKPvPAGLKLPPPHFVFRKIERTGSKTEINLVEQQKAFVD 210
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
725-758 |
1.22e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 51.68 E-value: 1.22e-08
10 20 30
....*....|....*....|....*....|....
gi 734521677 725 NGGKQALETVQRLLPVLCQdHGLTPDQVVAIASN 758
Cdd:pfam03377 1 DGGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
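The "predictable" base recognition mentioned in the TAL effector description is generally attributed to the two residues at positions 12 and 13 of each 34-residue repeat (the repeat-variable diresidue, RVD). The sketch below is illustrative only and is not part of the CD-Search output. It assumes that repeats can be located by the `LTPDQVV` prefix seen in the query, and it uses the commonly cited RVD preferences (NI→A, HD→C, NG→T, NN→G, with NN also tolerating A). The names `RVD_TO_BASE`, `REGION`, and `read_rvds` are hypothetical.

```python
# Commonly cited RVD-to-base preferences (an assumption, not taken from the CDD output).
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}  # NN is also reported to accept A

# Five consecutive repeats from the query (gi 734521677, residues 543-712),
# copied from the ungapped query row of the alignments above.
REGION = (
    "LTPDQVVAIASNIGGKQALETVQRLLPVLCQDHG"
    "LTPDQVVAIASNGGGKQALETVQRLLPVLCQDHG"
    "LTPDQVVAIASNGGGKQALETVQRLLPVLCQDHG"
    "LTPDQVVAIASNGGGKQALETVQRLLPVLCQDHG"
    "LTPDQVVAIASHDGGKQALETVQRLLPVLCQDHG"
)


def read_rvds(region, anchor="LTPDQVV"):
    """Return (RVD, predicted base) for every repeat that starts with `anchor`."""
    hits = []
    pos = region.find(anchor)
    while pos != -1:
        rvd = region[pos + 11:pos + 13]  # repeat positions 12 and 13 (1-based)
        if len(rvd) == 2:
            hits.append((rvd, RVD_TO_BASE.get(rvd, "?")))  # "?" for RVDs outside the table
        pos = region.find(anchor, pos + 1)
    return hits


if __name__ == "__main__":
    for rvd, base in read_rvds(REGION):
        print(rvd, base)
```

Anchoring on a fixed prefix is a simplification: repeats whose first residues diverge (as several do in the NF041308 model row) would be missed, so a profile-based hit list such as the pfam03377 intervals is the more robust way to delimit repeats.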
| PksD |
COG3321 |
Acyl transferase domain in polyketide synthase (PKS) enzymes [Secondary metabolites ... |
210-722 |
4.97e-05 |
|
Acyl transferase domain in polyketide synthase (PKS) enzymes [Secondary metabolites biosynthesis, transport and catabolism];
Pssm-ID: 442550 [Multi-domain] Cd Length: 1386 Bit Score: 47.94 E-value: 4.97e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 210 TYVEQHEVAvarycdLPSKLGHRQLEGRGSLLTCGDVEENPGPTGLSTIRPRRPSPARELLPGPQPDRVQPTADRGVSAP 289
Cdd:COG3321 865 TYPFQREDA------AAALLAAALAAALAAAAALGALLLAALAAALAAALLALAAAAAAALALAAAALAALLALVALAAA 938
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 290 AGSPLDGLPARRTVSRTRLPSPPAPSPAFSAGSFSDLLRPFDPSLLDTSLLDSMPAVGTPHTAAAPAEWDEAQSALRAAD 369
Cdd:COG3321 939 AAALLALAAAAAAAAAALAAAEAGALLLLAAAAAAAAAAAAAAAAAAAAAAAAAAAALAAAAALALLAAAALLLAAAAAA 1018
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 370 DPPPTVRVAVTAARPPRAKPAPRRRAAQPSDASPAAQVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVA 449
Cdd:COG3321 1019 AALLALAALLAAAAAALAAAAAAAAAAAALAALAAAAAAAAALALALAALLLLAALAELALAAAALALAAALAAAALALA 1098
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 450 LSQHPAALGTVAVTYQHIITALPEATHEDIVGVGKQWSGARALEALLTDAGELRGPPLQLDTGQLVKIAKRGGVTAMEAV 529
Cdd:COG3321 1099 LAALAAALLLLALLAALALAAAAAALLALAALLAAAAAAAALAAAAAAAAALALAAAAAALAAALAAALLAAAALLLALA 1178
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 530 HASRNALTGAPLNLTPDQVVAIASNIGGKQALETVQRLLPVLCQDHGLTPDQVVAIASNGGGKQALETVQRLLPVLCQDH 609
Cdd:COG3321 1179 LALAAALAAALAGLAALLLAALLAALLAALLALALAALAAAAAALLAAAAAAAALALLALAAAAAAVAALAAAAAALLAA 1258
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 610 GLTPDQVVAIASNGGGKQALETVQRLLPVLcqDHGLTPDQVVAIASNGGGKQALETVQRLLPVLCQDHGLTPDQVVAIAS 689
Cdd:COG3321 1259 LAALALLAAAAGLAALAAAAAAAAAALALA--AAAAAAAAALAALLAAAAAAAAAAAAAAAAAALAAALLAAALAALAAA 1336
|
490 500 510
....*....|....*....|....*....|...
gi 734521677 690 HDGGKQALETVQRLLPVLCQDHGLTPDQVVAIA 722
Cdd:COG3321 1337 VAAALALAAAAAAAAAAAAAAAAAAALAAAAGA 1369
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
244-422 |
5.76e-05 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 47.75 E-value: 5.76e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 244 GDVEENPGPTGLSTIRPRRPSPARELLPGPQPDRVQPtadrgvsaPAGSPLDGLPARRTVSRTRLPSPPAPSPAFSAGSF 323
Cdd:PHA03378 670 GHIPYQPSPTGANTMLPIQWAPGTMQPPPRAPTPMRP--------PAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAP 741
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 324 SDLLRPfdpslldtsllDSMPAVGTPHTAAAPAEWDEAQSALRAADDPPPtvRVAVTAARPPRAKPAPRrraaQPSDASP 403
Cdd:PHA03378 742 GRARPP-----------AAAPGRARPPAAAPGRARPPAAAPGAPTPQPPP--QAPPAPQQRPRGAPTPQ----PPPQAGP 804
|
170 180
....*....|....*....|
gi 734521677 404 AA-QVDLRTLGYSQQQQEKI 422
Cdd:PHA03378 805 TSmQLMPRAAPGQQGPTKQI 824
|
|
| SepH |
NF040712 |
septation protein SepH; Septation protein H (SepH) was first characterized in Streptomyces ... |
258-399 |
1.25e-03 |
|
septation protein SepH; Septation protein H (SepH) was first characterized in Streptomyces venezuelae, and homologs were identified in Mycobacterium smegmatis. SepH contains an N-terminal DUF3071 domain and a conserved C-terminal region. It binds directly to cell division protein FtsZ to stimulate the assembly of FtsZ protofilaments.
Pssm-ID: 468676 [Multi-domain] Cd Length: 346 Bit Score: 42.83 E-value: 1.25e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 258 IRPRRPSPARELLPGPQPDRVQPTADRG--VSAPAGSPLDGLPARRTVSRTRLPSPPAPSPAFSAGSFSDLLrPFDPSLL 335
Cdd:NF040712 188 IDPDFGRPLRPLATVPRLAREPADARPEevEPAPAAEGAPATDSDPAEAGTPDDLASARRRRAGVEQPEDEP-VGPGAAP 266
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 734521677 336 DTSLLDSMPAVGTPHTAAAPAEWDEAQSALRA--ADDPPPTVRVAVTAARPPRAKPAPRRRAAQPS 399
Cdd:NF040712 267 AAEPDEATRDAGEPPAPGAAETPEAAEPPAPApaAPAAPAAPEAEEPARPEPPPAPKPKRRRRRAS 332
|
|
| SP1-4_arthropods_N |
cd22553 |
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ... |
547-832 |
8.34e-03 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.
Pssm-ID: 411778 [Multi-domain] Cd Length: 384 Bit Score: 40.01 E-value: 8.34e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 547 QVVAIASNIGGKQ---ALETVQRLLPVLCQDHGLTPDQVVAIASNGGGKQAletVQRLlpVLCQDHGLTPD--QVVAIAS 621
Cdd:cd22553 14 QVATTASNIGGQQkqaQSDSSETHDPLILSPPLSQPQQIITAQSSGSAAGG---VAYS--VSPAVQTVTVDghEAIFIPA 88
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 622 NGGGKQAleTVQRLLPvlcqdhgLTPDQVVAIASNGGgkqaletvQRLLPVLCQDHGLTPDQVVAIASHDGGKQAletVQ 701
Cdd:cd22553 89 NSGLLQT--NNQQAIQ-------LAPGGTQAILANQQ--------TLIRPNTVQGQANASNVLQNIAQIASGGNA---VQ 148
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 702 RLLPVLCQDhglTPDQVVaiASNNGGKQALETVQrllpvlcqdhglTPDQVVAIASNIGGKQALETvqRLLPVLCQDHGL 781
Cdd:cd22553 149 LPLNNMTQT---IPVQVP--VSTANGQTVYQTIQ------------VPIQAIQSGNAGGGNQALQA--QVIPQLAQAAQL 209
|
250 260 270 280 290
....*....|....*....|....*....|....*....|....*....|.
gi 734521677 782 TPDQVVAIASHDGGKQALETVQRLLPVLCQDHGLTPDQVVAIASNNGGKQA 832
Cdd:cd22553 210 QPQQLAQVSSQGYIQQIPANASQQQPQMVQQGPNQSGQIIGQVASASSIQA 260
|
|
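The degenerate consensus quoted in the SP1-4 description (NNRCRCCYY, with N = any nucleotide, R = A/G, Y = C/T) can be turned into a regular expression for scanning a DNA string. This is a minimal sketch, not part of the CD-Search output: the helper names and the test sequence are hypothetical, only the three ambiguity codes named in the description are expanded, and only the given strand is scanned.

```python
import re

# Expansion of the ambiguity codes used in the description above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any nucleotide
}


def motif_to_pattern(motif):
    """Translate a degenerate motif such as 'NNRCRCCYY' into a regex string."""
    return "".join(IUPAC[code] for code in motif.upper())


def find_sites(seq, motif="NNRCRCCYY"):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead lets re.finditer report overlapping occurrences.
    pattern = re.compile("(?=" + motif_to_pattern(motif) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical sequence containing one CACCC-type site.
    print(find_sites("TTCCACACCCTGG"))
```

To cover both strands, the same function could be run on the reverse complement of the sequence; that step is left out here to keep the sketch short.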
|
|
Name |
Accession |
Description |
Interval |
E-value |
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
691-724 |
1.24e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 51.68 E-value: 1.24e-08
10 20 30
....*....|....*....|....*....|....
gi 734521677 691 DGGKQALETVQRLLPVLCQdHGLTPDQVVAIASN 724
Cdd:pfam03377 1 DGGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
793-826 |
1.24e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 51.68 E-value: 1.24e-08
10 20 30
....*....|....*....|....*....|....
gi 734521677 793 DGGKQALETVQRLLPVLCQdHGLTPDQVVAIASN 826
Cdd:pfam03377 1 DGGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
861-894 |
1.24e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 51.68 E-value: 1.24e-08
10 20 30
....*....|....*....|....*....|....
gi 734521677 861 DGGKQALETVQRLLPVLCQdHGLTPDQVVAIASN 894
Cdd:pfam03377 1 DGGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
929-962 |
1.24e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 51.68 E-value: 1.24e-08
10 20 30
....*....|....*....|....*....|....
gi 734521677 929 DGGKQALETVQRLLPVLCQdHGLTPDQVVAIASN 962
Cdd:pfam03377 1 DGGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
1065-1098 |
1.24e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 51.68 E-value: 1.24e-08
10 20 30
....*....|....*....|....*....|....
gi 734521677 1065 DGGKQALETVQRLLPVLCQdHGLTPDQVVAIASN 1098
Cdd:pfam03377 1 DGGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
827-860 |
1.66e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 51.30 E-value: 1.66e-08
10 20 30
....*....|....*....|....*....|....
gi 734521677 827 NGGKQALETVQRLLPVLCQdHGLTPDQVVAIASH 860
Cdd:pfam03377 1 DGGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
555-588 |
1.71e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 51.30 E-value: 1.71e-08
10 20 30
....*....|....*....|....*....|....
gi 734521677 555 IGGKQALETVQRLLPVLCQdHGLTPDQVVAIASN 588
Cdd:pfam03377 1 DGGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
963-996 |
1.71e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 51.30 E-value: 1.71e-08
10 20 30
....*....|....*....|....*....|....
gi 734521677 963 IGGKQALETVQRLLPVLCQdHGLTPDQVVAIASN 996
Cdd:pfam03377 1 DGGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
759-792 |
2.34e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 50.91 E-value: 2.34e-08
10 20 30
....*....|....*....|....*....|....
gi 734521677 759 IGGKQALETVQRLLPVLCQdHGLTPDQVVAIASH 792
Cdd:pfam03377 1 DGGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
590-622 |
4.78e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 50.14 E-value: 4.78e-08
10 20 30
....*....|....*....|....*....|...
gi 734521677 590 GGKQALETVQRLLPVLCQdHGLTPDQVVAIASN 622
Cdd:pfam03377 2 GGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
624-656 |
4.78e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 50.14 E-value: 4.78e-08
10 20 30
....*....|....*....|....*....|...
gi 734521677 624 GGKQALETVQRLLPVLCQdHGLTPDQVVAIASN 656
Cdd:pfam03377 2 GGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
998-1030 |
4.78e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 50.14 E-value: 4.78e-08
10 20 30
....*....|....*....|....*....|...
gi 734521677 998 GGKQALETVQRLLPVLCQdHGLTPDQVVAIASN 1030
Cdd:pfam03377 2 GGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
658-690 |
6.54e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 49.75 E-value: 6.54e-08
10 20 30
....*....|....*....|....*....|...
gi 734521677 658 GGKQALETVQRLLPVLCQdHGLTPDQVVAIASH 690
Cdd:pfam03377 2 GGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
1032-1064 |
6.54e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 49.75 E-value: 6.54e-08
10 20 30
....*....|....*....|....*....|...
gi 734521677 1032 GGKQALETVQRLLPVLCQdHGLTPDQVVAIASH 1064
Cdd:pfam03377 2 GGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
896-928 |
6.54e-08 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 49.75 E-value: 6.54e-08
10 20 30
....*....|....*....|....*....|...
gi 734521677 896 GGKQALETVQRLLPVLCQdHGLTPDQVVAIASH 928
Cdd:pfam03377 2 GGAQALEAVLEHGPALRQ-RGFSRADIVKIAGN 33
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
422-453 |
2.19e-05 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 42.44 E-value: 2.19e-05
10 20 30
....*....|....*....|....*....|..
gi 734521677 422 IKPKVRSTVAQHHEALVGHGFTHAHIVALSQH 453
Cdd:pfam03377 2 GGAQALEAVLEHGPALRQRGFSRADIVKIAGN 33
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
252-405 |
5.82e-04 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 44.54 E-value: 5.82e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 252 PTGLSTIRPRRPSPARELLPGPQPDRVQPTADRGVSAPAGSPLDGLPARRTVSRTRLPSPPAPSPAFSAGSFSDlLRPFD 331
Cdd:PHA03247 2720 PLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSE-SRESL 2798
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 332 PSLLDTSlldSMPAVGTPHTAAAPAewdeaqSALRAADDPPPTVRVAVTAARPPRAKPAP--------------RRRAAQ 397
Cdd:PHA03247 2799 PSPWDPA---DPPAAVLAPAAALPP------AASPAGPLPPPTSAQPTAPPPPPGPPPPSlplggsvapggdvrRRPPSR 2869
|
....*...
gi 734521677 398 PSDASPAA 405
Cdd:PHA03247 2870 SPAAKPAA 2877
|
|
| SepH |
NF040712 |
septation protein SepH; Septation protein H (SepH) was first characterized in Streptomyces ... |
258-399 |
1.25e-03 |
|
septation protein SepH; Septation protein H (SepH) was first characterized in Streptomyces venezuelae, and homologs were identified in Mycobacterium smegmatis. SepH contains an N-terminal DUF3071 domain and a conserved C-terminal region. It binds directly to cell division protein FtsZ to stimulate the assembly of FtsZ protofilaments.
Pssm-ID: 468676 [Multi-domain] Cd Length: 346 Bit Score: 42.83 E-value: 1.25e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 258 IRPRRPSPARELLPGPQPDRVQPTADRG--VSAPAGSPLDGLPARRTVSRTRLPSPPAPSPAFSAGSFSDLLrPFDPSLL 335
Cdd:NF040712 188 IDPDFGRPLRPLATVPRLAREPADARPEevEPAPAAEGAPATDSDPAEAGTPDDLASARRRRAGVEQPEDEP-VGPGAAP 266
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 734521677 336 DTSLLDSMPAVGTPHTAAAPAEWDEAQSALRA--ADDPPPTVRVAVTAARPPRAKPAPRRRAAQPS 399
Cdd:NF040712 267 AAEPDEATRDAGEPPAPGAAETPEAAEPPAPApaAPAAPAAPEAEEPARPEPPPAPKPKRRRRRAS 332
|
|
| TAL_effector |
pfam03377 |
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair ... |
520-554 |
1.54e-03 |
|
TAL effector repeat; The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.
Pssm-ID: 397449 [Multi-domain] Cd Length: 33 Bit Score: 37.43 E-value: 1.54e-03
10 20 30
....*....|....*....|....*....|....*
gi 734521677 520 RGGVTAMEAVHASRNALTGAplNLTPDQVVAIASN 554
Cdd:pfam03377 1 DGGAQALEAVLEHGPALRQR--GFSRADIVKIAGN 33
|
|
| PRK10307 |
PRK10307 |
colanic acid biosynthesis glycosyltransferase WcaI; |
950-1008 |
1.54e-03 |
|
colanic acid biosynthesis glycosyltransferase WcaI;
Pssm-ID: 236670 [Multi-domain] Cd Length: 412 Bit Score: 42.66 E-value: 1.54e-03
10 20 30 40 50 60
....*....|....*....|....*....|....*....|....*....|....*....|....
gi 734521677 950 GLTPDQVVAIAS-NIGGKQALETV----QRLlpvlcQDHgltPDQVVAIASNGGGKQALETVQR 1008
Cdd:PRK10307 224 GLPDGKKIVLYSgNIGEKQGLELVidaaRRL-----RDR---PDLIFVICGQGGGKARLEKMAQ 279
|
|
| PRK10307 |
PRK10307 |
colanic acid biosynthesis glycosyltransferase WcaI; |
541-600 |
1.62e-03 |
|
colanic acid biosynthesis glycosyltransferase WcaI;
Pssm-ID: 236670 [Multi-domain] Cd Length: 412 Bit Score: 42.66 E-value: 1.62e-03
10 20 30 40 50 60
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 734521677 541 LNLTPDQVVAIAS-NIGGKQALETV----QRLlpvlcQDHgltPDQVVAIASNGGGKQALETVQR 600
Cdd:PRK10307 223 LGLPDGKKIVLYSgNIGEKQGLELVidaaRRL-----RDR---PDLIFVICGQGGGKARLEKMAQ 279
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
342-405 |
1.86e-03 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 42.67 E-value: 1.86e-03
10 20 30 40 50 60
....*....|....*....|....*....|....*....|....*....|....*....|....
gi 734521677 342 SMPAVGTPHTAAAPAEWDEAQSALRAADDPPPTVRVAVTAARPPRAKPAPRRRAAQPSDASPAA 405
Cdd:PRK07764 433 PAPAPAPAPPSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAA 496
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
260-405 |
2.03e-03 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 42.56 E-value: 2.03e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 260 PRRPSPARELLP------GPQPDRVQPTADRGVSAPAGSPLDGLPARRTVSRTRLPSPPAPSPAFSAGSFSDLLRPFDps 333
Cdd:PRK12323 423 PARRSPAPEALAaarqasARGPGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWE-- 500
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 734521677 334 lldtSLLDSMPAVGTPHTAAAPAEWDEAQSALRAADDPPPTvrvAVTAARPPRAKPAPRRRAAQPSDASPAA 405
Cdd:PRK12323 501 ----ELPPEFASPAPAQPDAAPAGWVAESIPDPATADPDDA---FETLAPAPAAAPAPRAAAATEPVVAPRP 565
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
244-407 |
3.21e-03 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 42.23 E-value: 3.21e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 244 GDVEENPGPTGLSTIRPRRPSPArellPGPQPDRVQPTADRGVSAPAGSPLDGLPARRTVSRTRLPSPPAPSPAFSA--- 320
Cdd:PHA03247 2606 GDPRGPAPPSPLPPDTHAPDPPP----PSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSppq 2681
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 321 -----------GSFSDLLRPFDPSllDTSLLDSMPAVGTPHTAAAPAEWDEAQSALRAADDPPPTVRVAVTAArPPRAKP 389
Cdd:PHA03247 2682 rprrraarptvGSLTSLADPPPPP--PTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPG-GPARPA 2758
|
170
....*....|....*...
gi 734521677 390 APRRRAAQPSDASPAAQV 407
Cdd:PHA03247 2759 RPPTTAGPPAPAPPAAPA 2776
|
|
| PRK14951 |
PRK14951 |
DNA polymerase III subunits gamma and tau; Provisional |
326-427 |
6.11e-03 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237865 [Multi-domain] Cd Length: 618 Bit Score: 40.85 E-value: 6.11e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 326 LLR--PFDPSLLDTSLLDSMPAVGTPHTAAAPAEWDEAQSALRAADDPPPTVRVAVTAARPPRAKPAPRRRAAQ----PS 399
Cdd:PRK14951 358 LLRllAFKPAAAAEAAAPAEKKTPARPEAAAPAAAPVAQAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAAPAAaapaAA 437
|
90 100
....*....|....*....|....*...
gi 734521677 400 DASPAAQVDLRTLGYSQQQQEKIKPKVR 427
Cdd:PRK14951 438 PAAAPAAVALAPAPPAQAAPETVAIPVR 465
|
|
| SP1-4_arthropods_N |
cd22553 |
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ... |
547-832 |
8.34e-03 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods have only 3 SPs. One of these is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.
Pssm-ID: 411778 [Multi-domain] Cd Length: 384 Bit Score: 40.01 E-value: 8.34e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 547 QVVAIASNIGGKQ---ALETVQRLLPVLCQDHGLTPDQVVAIASNGGGKQAletVQRLlpVLCQDHGLTPD--QVVAIAS 621
Cdd:cd22553 14 QVATTASNIGGQQkqaQSDSSETHDPLILSPPLSQPQQIITAQSSGSAAGG---VAYS--VSPAVQTVTVDghEAIFIPA 88
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 622 NGGGKQAleTVQRLLPvlcqdhgLTPDQVVAIASNGGgkqaletvQRLLPVLCQDHGLTPDQVVAIASHDGGKQAletVQ 701
Cdd:cd22553 89 NSGLLQT--NNQQAIQ-------LAPGGTQAILANQQ--------TLIRPNTVQGQANASNVLQNIAQIASGGNA---VQ 148
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 734521677 702 RLLPVLCQDhglTPDQVVaiASNNGGKQALETVQrllpvlcqdhglTPDQVVAIASNIGGKQALETvqRLLPVLCQDHGL 781
Cdd:cd22553 149 LPLNNMTQT---IPVQVP--VSTANGQTVYQTIQ------------VPIQAIQSGNAGGGNQALQA--QVIPQLAQAAQL 209
|
250 260 270 280 290
....*....|....*....|....*....|....*....|....*....|.
gi 734521677 782 TPDQVVAIASHDGGKQALETVQRLLPVLCQDHGLTPDQVVAIASNNGGKQA 832
Cdd:cd22553 210 QPQQLAQVSSQGYIQQIPANASQQQPQMVQQGPNQSGQIIGQVASASSIQA 260
|
|
|