| Name | Accession | Description | Interval | E-value |
| RHS_core | NF041261 | RHS element core protein; | 285-1513 | 2.11e-31 |
|
RHS element core protein;
Pssm-ID: 469161 [Multi-domain] Cd Length: 1261 Bit Score: 135.13 E-value: 2.11e-31
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 285 GQPTSGfadDPVNAATGNFVEP-QVDLGFTGGSATLeWGRVYNS----MSQVCGAFGPGWSSLAESRLVVTDESARWVQA 359
Cdd:NF041261 41 GGMTSG---NPVNPLLGAKVLPgETDIALPGPLPFI-LSRTYSSyrtrTPAPVGVFGPGWKAPSDIRLQLRDDGLILNDN 116
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 360 DGRHIVFGRM--GEGwGRAQCENLWLEQptgmGGVAFL----------------IRDNDGGSFAFSQAGRP----VFQDR 417
Cdd:NF041261 117 GGRSIHFEPLfpGEA-VYSRSESLWLVR----GGVAAQpdghtlaalwqalpedIRLSPHLYLATNSAQGPwwilGWSER 191
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 418 GPGS-----------RVaFTYSEDRLVRL-------EHEFGRAVELVWSQDGRRVVEALASDGRRVgyvydEQERLVEST 479
Cdd:NF041261 192 VPGAdevlpaplppyRV-LTGMVDRFGRTltfhreaAGDLAGEITGVTDGAGREFRLVLTTQAQRA-----EEARKQRTS 265
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 480 GDTATHRYEWNEQGLMCRVVDADGVVEVDN---------TYDGQgrvVTQRSPHGRLSRYSYL-GGRVTVVCDEDGARAN 549
Cdd:NF041261 266 SLSSPDGPRPLSSSAFPDTLPGGTEYGPDNgirlsavwlTHDPA---YPESLPAAPLVRYTYTeAGELLAVYDRSNTQVR 342
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 550 TYVHDGK--GRLVGvvdaedHRQSaswdrfgnqvmvtdreGRTVVR-SFDTRGHVIAEQTVSGARIEQDFdEQDRLIEVR 626
Cdd:NF041261 343 AFTYDAQhpGRMVA------HRYA----------------GRPEMCyRYDDTGRVTEQLNPAGLSYRYQY-EQDRITITD 399
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 627 VINDDQV--------------------SVTEMVYQGDNRNPARiVDPVGGVTVLEWD--GSLLTKVVDPTGVTLTMGYDA 684
Cdd:NF041261 400 SLNRREVlhtegegglkrvvkkehadgSVTRSGYDAAGRLTAQ-TDAAGRRTEYSLNvvSGDITDITTPDGRETKFYYND 478
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 685 HGDLVSTTNAAGHTARLERDEQGRVVAAITPLGHTTRYRYDE--RGMCVERIDPDGAVWRFEYSAGGRLLATVDPLGGRV 762
Cdd:NF041261 479 GNQLTSVTSPDGLESRREYDEPGRLVSETSRSGETTRYRYDDphSELPATTTDATGSTKQMTWSRYGQLLAFTDCSGYQT 558
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 763 EVERDEAGEESATIDELGRRVERRVDDLGNLTrvelpdgsawqfsydamsrmqSMKDASGGVWQFSYDAEGTLKATTDAt 842
Cdd:NF041261 559 RYEYDRFGQMTAVHREEGISTYRRYDNRGQLT---------------------SVKDAQGRETRYEYNAAGDLTAVITP- 616
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 843 ggvrryetnqmalptayldDGKREEAVYDRLGRMVCHINADATSTkTRYDLAGLPVEIIDEAGGVTSIERDLAGRPVAVT 922
Cdd:NF041261 617 -------------------DGNRSETQYDAWGKAVSTTQGGLTRS-MEYDAAGRITTLTNENGSHSTFLYDALDRLVQQR 676
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 923 QPMGQTFRYEYDECGrwaatistggdryemiydadgriegevwptgeRVTTTFDEcgravarrepgrGLTRL-KYDKLGR 1001
Cdd:NF041261 677 GFDGRTQRYHYDLTG--------------------------------KLTQSEDE------------GLVTLwHYDESDR 712
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1002 VVWSQDSWYGTRRFRYDAAGQMSEVVNAAGG---VTHFAYDELGRigevtdpMGGVTRHTYDP-MGRLLTSTDplgrvTR 1077
Cdd:NF041261 713 ITHRTVNGEPAEQWQYDEHGWLTDISHLSEGhrvAVHYGYDDKGR-------LTGERQTVENPeTGELLWQHE-----TG 780
|
890 900 910 920 930 940 950 960
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1078 YSYDAAGRVTRRVDGAGASMSWVYDSAGRRVEEWVNDTLLAATTRDFVGRTTT----IVEGDARHELR--FDVRGNLLWR 1151
Cdd:NF041261 781 HAYNEQGLANRVTPDSLPPVEWLTYGSGYLAGMKLGGTPLVEYTRDRLHRETVrsfgGAGSNAAYELTtaYTPAGQLQSQ 860
|
970 980 990 1000 1010 1020 1030 1040
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1152 GRGDQGVRWEYDQNSRRTcMIRPNG--QSTSYEYDANNRVSAFIQEGLG---RVVIDRDSLGRIV---SVFADGLYASWE 1223
Cdd:NF041261 861 HLNSLVYDRDYTWNDNGD-LVRISGprQTREYGYSATGRLTGVHTTAANldiRIPYATDPAGNRLpdpELHPDSTLTAWP 939
|
1050 1060 1070 1080 1090 1100 1110 1120
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1224 yadgavvRQRVERNgfiSESVITRDENGRvVADKTDGI-----------TTFYSYDQTGQLV---RAQTSEGLVTT-WEY 1288
Cdd:NF041261 940 -------DNRIAED---AHYVYRYDEYGR-LTEKTDRIpegvirtdderTHHYHYDSQHRLVfytRIQHGEPLVESrYLY 1008
|
1130 1140 1150 1160 1170 1180 1190 1200
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1289 DANGRMVvedaAGTVSRFTYDAASQLvSVTNPDGTTTYAYdDAGRRVREQGPAGERRFTWDPRGFLDSITQINHDGDKVA 1368
Cdd:NF041261 1009 DPLGRRM----AKRVWRRERDLTGWM-SLSRKPEVTWYGW-DGDRLTTVQTDTTRIQTVYQPGSFTPLIRVETENGERAK 1082
|
1210 1220 1230 1240 1250 1260 1270 1280
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1369 AQtQRLWVDALGELARVDQDSVwwdssSFMPTLVQYGARsvVTEAGVTGMVDGPDASWL---------------PPQWSA 1433
Cdd:NF041261 1083 AQ-RRSLAETLQQEGSENGHGV-----VFPAELVRMLDR--LEEEIRADRVSEESRAWLaqcgltveqmarqvePEYTPA 1154
|
1290 1300 1310 1320 1330 1340 1350 1360
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1434 R-------DHGGNPMDPWAVATNVGVSAQ----GSLLVQ---------------------GMEWMQARVYDPATRGFLST 1481
Cdd:NF041261 1155 RklhlyhcDHRGLPLALISEEGNTAWQGEydewGNLLNEenphhlqqpyrlpgqqydeesGLYYNRNRYYDPLQGRYITQ 1234
|
1370 1380 1390
....*....|....*....|....*....|..
gi 550741602 1482 DPLpGVAGaGWsgNPYAFAGNdPVNFSDPLGL 1513
Cdd:NF041261 1235 DPI-GLKG-GW--NLYQYPLN-PIRFIDPLGL 1261
|
|
| RhsA | COG3209 | Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction ... | 510-1541 | 9.59e-21 |
|
Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction only];
Pssm-ID: 442442 [Multi-domain] Cd Length: 1103 Bit Score: 100.22 E-value: 9.59e-21
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 510 TYDGQGRVVTQRSPHGRLSRYSYLGGRVTVVCDEDGARANTYVHDGKGRLVGVVDAEDHRQSASWDRFGNQVMVTDREGR 589
Cdd:COG3209 14 SSTLLAATNAGGGTAVTNAGSTVLLAKGGLSTAAAAGGAATLTARSASTTDVVGTLTGAGGTSAGGVTALGDASAAGGGY 93
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 590 TVVRSFDTRGHVIAEQTVSGARIEQDFDEQDRLIEVRVINDDQVSVTEMVYQGDNRNPARIVDPVGGVTVLEWDGSLLTK 669
Cdd:COG3209 94 VGGAAAGGGATLTGLAAATASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTL 173
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 670 VVDPTGVTLTMGYDAHGDLVSTTNAA----GHTARLERDEQGRVVAAITPLGHTTRYRYDERGMCVERIDPDGAVWRFEY 745
Cdd:COG3209 174 GGAAAGPATGVGTGAVTLATGLAGSAllalGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVA 253
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 746 SAGGRLLATVDPLGGRVEVERDEAGEESATIDELGRRVERRVDDLGNLTRVELPDGSAWQFSYDAMSRMQSMKDASGGVW 825
Cdd:COG3209 254 TAATTLGGTTGAGTGASGAGLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAG 333
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 826 QFSYDAEGTLKATTDATGGVRRYETNQMALPTAYLDDGKREEAVYDRLGRMVCHINADATSTKTRYDLAGLPVEIIDEAG 905
Cdd:COG3209 334 TTTTTGTGTGGTTTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTT 413
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 906 GVTSIERDLAGRPVAVTQPMGQTFRYEYDECGRWAATISTGGDRYEMIYDADGRIEGEVWPTGERVTTTFDECGRAVARR 985
Cdd:COG3209 414 GGDGGPATAAGALTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGA 493
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 986 EPGRGLTRLKYDKLGRVVWSQDSWYGTRRFRYDAAGQMSEVVNAAGGVTHFAYDELGRIGEVTDPMGGVTRHTYDPMGRL 1065
Cdd:COG3209 494 TTLGTDTTLDDTLGGTTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGD 573
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1066 LTSTDPLGRVTRYSYDAAGRVTRRVDGAGASMSWVYDSAGRRVEEWVNDTLLAATTRDFVGRTTTIVEGDARHELRFDVR 1145
Cdd:COG3209 574 GTGGASTTTGTTGGTATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGTGVTTTG 653
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1146 GNLLWRGRGDQGVRWEYDQNSRRTCMIRPNGQSTSYEYDANNRVSAFIQEGLGRVVIDRDSLGRIVSVFADGLYASWEYA 1225
Cdd:COG3209 654 TTTTRATGTTGTGTGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTD 733
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1226 DGAVVRQRVERNG-------FISESVITRDENGRVVADKT------DGITTFYSYDQTGQLVRAQTSEGLVTTWEYDANG 1292
Cdd:COG3209 734 GTGTGGTTGTLTTtstttttTAGALTYTYDALGRLTSETTpggvtqGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALG 813
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1293 RMV----VEDAAGTVS---RFTYDAASQLVSVTN----PDGTTTYAYDDAGRRVREQGPAGERRFTWDPRGFLDSitqin 1361
Cdd:COG3209 814 RLTsvitVGSGGGTDLqdrTYTYDAAGNITSITDalraGTLTQTYTYDALGRLTSATDPGTTESYTYDANGNLTS----- 888
|
890 900 910 920 930 940 950 960
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1362 hdgdKVAAQTQRLWVDALGELARVDQDSVwwDSSSFmptlvQYGARSVVTEAG-VTGMVDGPDAswlpPQWSARdhggnp 1440
Cdd:COG3209 889 ----RTDGGTTTYTYDALGRLVSVTKPDG--TTTTY-----TYDALGHTDHLGsVRALTDASGQ----VVWRYD------ 947
|
970 980 990 1000 1010 1020 1030 1040
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1441 MDPW-AVATNVGVSAQGSLLVQGMEW--------MQARVYDPATRGFLSTDPLPGVAGAgwsgNPYAFAGNDPVNFSDPL 1511
Cdd:COG3209 948 YDPFgNLLAETSGAAANPLRFTGQEYdaetglyyNGARYYDPALGRFLSPDPIGLAGGL----NLYAYVGNNPVNYVDPL 1023
|
1050 1060 1070
....*....|....*....|....*....|
gi 550741602 1512 GLRPLTDEDLKGYRDAARSPLAKAADAAGG 1541
Cdd:COG3209 1024 GLAALLGTTGLGGGAGVGAGAAGGGAAAAG 1053
|
|
| DUF6531 | pfam20148 | Domain of unknown function (DUF6531); This putative domain is found in a range of RHS proteins. | 294-366 | 1.87e-20 |
|
Domain of unknown function (DUF6531); This putative domain is found in a range of RHS proteins.
Pssm-ID: 466309 [Multi-domain] Cd Length: 74 Bit Score: 86.82 E-value: 1.87e-20
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 550741602 294 DPVNAATGNFVEPQVDLGFtGGSATLEWGRVYNSMSQVCGAFGPGWSSLAESRLVV-TDESARWVQADGRHIVF 366
Cdd:pfam20148 2 DPVNVATGNKVLEETDFSL-PGPLPLVWTRTYNSSSERDGPLGPGWSHPYDQRLELeGDGGVVYIDADGREVTF 74
|
|
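The DUF6531 alignment above can be summarized as a percent identity. The following is a minimal sketch, assuming that uppercase letters mark aligned residues, lowercase letters mark unaligned insertions, and `-` marks a gap in this display; the helper names `parse_row` and `percent_identity` are hypothetical and not part of any NCBI tool.

```python
def parse_row(line):
    """Split one alignment row into (label, start, residues, end)."""
    fields = line.split()
    # The last three fields are start, residues, end; the label may contain spaces.
    start, residues, end = fields[-3], fields[-2], fields[-1]
    label = " ".join(fields[:-3])
    return label, int(start), residues, int(end)


def percent_identity(query, subject):
    """Identity over columns where both rows carry an uppercase residue."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    aligned = matches = 0
    for q, s in zip(query, subject):
        # Assumption: lowercase letters and '-' are not aligned positions.
        if q.isupper() and s.isupper():
            aligned += 1
            matches += q == s
    return 100.0 * matches / aligned if aligned else 0.0


# Rows copied from the DUF6531 (pfam20148) alignment above.
query_row = "gi 550741602 294 DPVNAATGNFVEPQVDLGFtGGSATLEWGRVYNSMSQVCGAFGPGWSSLAESRLVV-TDESARWVQADGRHIVF 366"
model_row = "Cdd:pfam20148 2 DPVNVATGNKVLEETDFSL-PGPLPLVWTRTYNSSSERDGPLGPGWSHPYDQRLELeGDGGVVYIDADGREVTF 74"

q_label, q_start, q_seq, q_end = parse_row(query_row)
m_label, m_start, m_seq, m_end = parse_row(model_row)
identity = percent_identity(q_seq, m_seq)
print(f"{q_label} vs {m_label}: {identity:.1f}% identity over aligned columns")
```

The same two functions should work on any single block of the longer alignments, provided the query and model rows of that block are passed as a pair.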
| PT-HINT | pfam07591 | Pretoxin HINT domain; A member of the HINT superfamily of proteases that is usually found ... | 1700-1828 | 1.00e-18 |
|
Pretoxin HINT domain; A member of the HINT superfamily of proteases that is usually found N-terminal to the toxin module in polymorphic toxin systems. The domain is predicted to function in releasing the toxin domain by autoproteolysis.
Pssm-ID: 400120 [Multi-domain] Cd Length: 136 Bit Score: 84.23 E-value: 1.00e-18
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1700 VLMADGTsKNIEDVVAGDMVAAYNPETGQAEPGEVTDTYIHDQVATWQVTTESGTVTTTAV-----HPFYVDGkGWVEAQ 1774
Cdd:pfam07591 2 VLTADGY-KAIANIKAGDRVIAKDEASGKTGYKPVTATYGNPYQETVYITISDGIGNSQTLisnfiHPFYSKG-KWIEAG 79
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*..
gi 550741602 1775 DLRAGDQLYDETGSLVTVVSIQATGQIATVYNVTVKDLHNYYVADD---SSWVLAHN 1828
Cdd:pfam07591 80 DLKVGDKLLDESGNVQTVENIKLKDKPLKAYNLTVADWHTYFVKGNqaeTEGVWVHN 136
|
|
| Hint | cd00081 | Hedgehog/Intein domain, found in Hedgehog proteins as well as proteins which contain inteins ... | 1693-1828 | 2.41e-14 |
|
Hedgehog/Intein domain, found in Hedgehog proteins as well as proteins which contain inteins and undergo protein splicing (e.g. DnaB, RIR1-2, GyrA and Pol). In protein splicing an intervening polypeptide sequence - the intein - is excised from a protein, and the flanking polypeptide sequences - the exteins - are joined by a peptide bond. In addition to the autocatalytic splicing domain, many inteins contain an inserted endonuclease domain, which plays a role in spreading inteins. Hedgehog proteins are a major class of intercellular signaling molecules, which control inductive interactions during animal development. The mature signaling forms of hedgehog proteins are the N-terminal fragments, which are covalently linked to cholesterol at their C-termini. This modification is the result of an autoprocessing step catalyzed by the C-terminal fragments, which are aligned here.
Pssm-ID: 238035 [Multi-domain] Cd Length: 136 Bit Score: 71.92 E-value: 2.41e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1693 CFVAGTQVLMADGTSKNIEDVVA--GDMVAAYNpETGQAEPGEVtdTYIHDQVATWQVTTESGTVTTT----AVHPFYV- 1765
Cdd:cd00081 1 CFTGDTLVLLEDGGRKKIEELVEkkGDKVLALD-ETGKLVFSKV--LKVLRRDYEKKFYKIKTESGREitltPDHLLFVl 77
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 550741602 1766 --DGKGWVEAQDLRAGDQLYdeTGSLVTVVSIQATGQIATVYNVTVKDLHNYYVADdsswVLAHN 1828
Cdd:cd00081 78 edGELKWVFASDLKPGDYVL--VPVLEKVKEIEEIEYTGGVYDLTVEDNHNFIANG----VLVHN 136
|
|
| Rhs_assc_core | TIGR03696 | RHS repeat-associated core domain; This model represents a conserved unique core sequence ... | 1442-1513 | 2.74e-12 |
|
RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.
Pssm-ID: 274730 [Multi-domain] Cd Length: 77 Bit Score: 63.67 E-value: 2.74e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1442 DPWAVATNVGVSAQGSLLVQGMEW--------MQARVYDPATRGFLSTDPLpGVAGaGWsgNPYAFAGNDPVNFSDPLGL 1513
Cdd:TIGR03696 2 DPYGEVLSESGAAPNPLRFTGQYYdaetglyyNGARYYDPELGRFLSPDPI-GLGG-GL--NLYAYVGNNPVNWVDPLGL 77
|
|
| Tox-REase-5 | pfam15648 | Restriction endonuclease fold toxin 5; A predicted toxin of the restriction endonuclease fold ... | 1853-1925 | 7.02e-09 |
|
Restriction endonuclease fold toxin 5; A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, or PrsW-peptidase dependent secretion system. Versions of this domain are also found in caudoviruses.
Pssm-ID: 464785 Cd Length: 95 Bit Score: 54.73 E-value: 7.02e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1853 ARAYELQITGRPEGYYVN---------GVEFDGFQNGE--LLDAKGlGYANLI------PAEWSTAAKQLEDAADRQLEA 1915
Cdd:pfam15648 1 ARAYQARITGFPYGPEYRvkieewlwlGVDFDGFDPAEclLLEAKA-GYDQFFdpklpkPKKFFKGADKLLEQAERQNRA 79
|
90
....*....|....
gi 550741602 1916 --AGSTP--IHWIF 1925
Cdd:pfam15648 80 arAGGPPvrLRWHF 93
|
|
| HintN | smart00306 | Hint (Hedgehog/Intein) domain N-terminal region; Hedgehog/Intein domain, N-terminal region. ... | 1692-1780 | 9.04e-07 |
|
Hint (Hedgehog/Intein) domain N-terminal region; Hedgehog/Intein domain, N-terminal region. Domain has been split to accommodate large insertions of endonucleases.
Pssm-ID: 197642 [Multi-domain] Cd Length: 100 Bit Score: 48.81 E-value: 9.04e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1692 GCFVAGTQVLMADGTSKNIEDVVAGDMVAAYNPETGQAEPGEVTDTYIHDQVATWQVTTESGTVTTTAV--HPFYVDGKG 1769
Cdd:smart00306 1 GCFPGDTLVLTEDGGIKKIEELEEGDKVLALDEGTLKYSPVKVFLVREPKGEKKFYRIKTENGREITLTpdHLLLVRDGG 80
|
90
....*....|....
gi 550741602 1770 ---WVEAQDLRAGD 1780
Cdd:smart00306 81 klvWVFASELKPGD 94
|
|
| Hop | COG1372 | Intein/homing endonuclease [Replication, recombination and repair, Mobilome: prophages, ... | 1687-1780 | 2.21e-04 |
|
Intein/homing endonuclease [Replication, recombination and repair, Mobilome: prophages, transposons];
Pssm-ID: 440983 [Multi-domain] Cd Length: 866 Bit Score: 46.43 E-value: 2.21e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1687 DVSRHGCFVAGTQVLMADGTSKNIEDVV---AGDMVAAYNPETGQAEPGEVT---DTYIHDQV-----------ATwqvt 1749
Cdd:COG1372 92 DTGTGVCLTGDTLVLTADGRLVPIGELVgsgEDVEVLSLDLDTGKLVWAPVTkvfKTGVKPVYrirtrsgreirAT---- 167
|
90 100 110
....*....|....*....|....*....|.
gi 550741602 1750 tesgtvtttAVHPFYVDgKGWVEAQDLRAGD 1780
Cdd:COG1372 168 ---------PDHPFLTL-SGWKEAGELKPGD 188
|
|
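The four HINT-family hits listed above (PT-HINT, Hint, HintN and Hop) report nearby intervals on the query. The following is a minimal sketch of how one might check how far they overlap, using the Interval values from the table as 1-based inclusive coordinates; the function names `overlap_length` and `shared_region` are hypothetical.

```python
# Intervals copied from the hit table above (1-based, inclusive query coordinates).
hits = {
    "PT-HINT": (1700, 1828),
    "Hint": (1693, 1828),
    "HintN": (1692, 1780),
    "Hop": (1687, 1780),
}


def overlap_length(a, b):
    """Number of query positions shared by two inclusive intervals."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def shared_region(intervals):
    """Region covered by every interval, or None if there is no common region."""
    intervals = list(intervals)
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start <= end else None


common = shared_region(hits.values())
print("Region covered by all four HINT-family hits:", common)
for name, interval in hits.items():
    print(name, "overlaps PT-HINT by", overlap_length(interval, hits["PT-HINT"]), "residues")
```

Overlapping hits of this kind usually describe the same region through related models, so the hit with the lowest E-value is a reasonable first choice when annotating the region.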
| Bacuni_01323_like | cd12871 | Uncharacterized protein conserved in Bacteroidetes; A well-conserved family of 16-stranded ... | 1174-1330 | 5.21e-04 |
|
Uncharacterized protein conserved in Bacteroidetes; A well-conserved family of 16-stranded beta barrels resembling outer membrane porins. The interior of the barrels is mostly occupied by an insert with partially helical structure.
Pssm-ID: 214015 [Multi-domain] Cd Length: 231 Bit Score: 43.56 E-value: 5.21e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1174 PNGQSTSYEYDANNRVSAFIQeglgrvvIDRDSLGRIVSVfadglyASWEYADGAVVrqrVERNGFISESVITRDENGRV 1253
Cdd:cd12871 14 KSTSEYTFEYDADGRLTSITT-------TQEGEAEEITYT------TTITYEPNVIT---VTDDGGKTVSTYTLNEKGYV 77
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1254 VA------DKTDGITTFYSYDQTGQLVRAQTSEGL-VTTWEYD-ANGRMV----VEDAAGTVSRFTYDAASQLVSVTNP- 1320
Cdd:cd12871 78 TScteteyGKGQLRTYTFTYNADGQLTKIVESIGTeYSTITITwNNGDIVsistKSNTEENESKITYTSDKVYNPIVNKg 157
|
170
....*....|....*
gi 550741602 1321 -----DGTTTYAYDD 1330
Cdd:cd12871 158 clmlfGLTLGYDLSD 172
|
|
| RhsA | COG3209 | Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction ... | 265-1187 | 1.21e-17 |
|
Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction only];
Pssm-ID: 442442 [Multi-domain] Cd Length: 1103 Bit Score: 89.82 E-value: 1.21e-17
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 265 ASGISSSRDDIVVDPPTAMGGQPTSGFADDPVNAATGNFVEPQVDLGFTGGSATLEWGRVYNSMSQVCGAFGPGWSSLAE 344
Cdd:COG3209 1 ETSLGLVGGTTGASSTLLAATNAGGGTAVTNAGSTVLLAKGGLSTAAAAGGAATLTARSASTTDVVGTLTGAGGTSAGGV 80
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 345 SRLVVTDESARWVQADGRHIVFGRM-GEGWGRAQCENLWLEQPTGMGGVAFLIRDNDGGSFAFSQAGRPVFQDRGPGSRV 423
Cdd:COG3209 81 TALGDASAAGGGYVGGAAAGGGATLtGLAAATASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTG 160
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 424 AFTYSEDRLVRLEHEFGRAVELVWSQDGRRVVEALASDGRRVGYVYDEQERLVESTGDTATHRYEWNEQGLM--CRVVDA 501
Cdd:COG3209 161 LAGGGASAYGLTLGGAAAGPATGVGTGAVTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPAsvAATVTG 240
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 502 DGVVEVDNTYDGQGRVVTQRSPHGRLSRYSYLGGRVTVVCDEDGARANTYVHDGKGRLVGVVDAEDHRQSASWDRFGNQV 581
Cdd:COG3209 241 SATGAAGAGAAVATAATTLGGTTGAGTGASGAGLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGT 320
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 582 MVTDREGRTVVRSFDT---RGHVIAEQTVSGARIEQDFDEQDRLIEVRVINDDQVSVTEMVYQGDNRNPARIVDPVGGVT 658
Cdd:COG3209 321 TGTAAVSGAADAGTTTttgTGTGGTTTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSST 400
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 659 VLEWDGSLLTKVVDPTGVTLTMGYDAHGDLVSTTNAAGHTARLERDEQGRVVAAITPLGHTTRYRYDERGMCVERIDPDG 738
Cdd:COG3209 401 TGVGAGTTTTSTTGGDGGPATAAGALTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAG 480
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 739 AVWRFEYSAGGR---LLATVDPLGGRVEVERDEAGEESATIDELGRRVERRVDDLGNLTRVELPDGSAWQFSYDAMSRMQ 815
Cdd:COG3209 481 TGGGTLTSGSAGattLGTDTTLDDTLGGTTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTST 560
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 816 SMKDASGGVWQFSYDAEGTLKATTDATGGVRRYETNQMALPTAYLDDGKREEAVYDRLGRMVCHINADATSTKTRYDLAG 895
Cdd:COG3209 561 GTGGTGTVTTTGDGTGGASTTTGTTGGTATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGST 640
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 896 LPVEIIDEAGGVTSIERDLAGRPVAVTQPMGQTFRYEYDECGRWAATISTGGDRYEMIYDADGRIEGEVWPTGERVTTTF 975
Cdd:COG3209 641 TGGTTGTGVTTTGTTTTRATGTTGTGTGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLG 720
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 976 DECGRAVARREPGRGLTRLKYDKLGRVVWSQDSWYGTRRFRYDAAGQMSEVVNAAGGVTHfaydelgrigevtdpmGGVT 1055
Cdd:COG3209 721 TTTTGGGGGTTTDGTGTGGTTGTLTTTSTTTTTTAGALTYTYDALGRLTSETTPGGVTQG----------------TYTT 784
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1056 RHTYDPMGRLLTSTDPLGRVTRYSYDAAGRVTR------RVDGAGASMSWVYDSAGRRVEEWVNDTLLAATTR---DFVG 1126
Cdd:COG3209 785 RYTYDALGRLTSVTYPDGETVTYTYDALGRLTSvitvgsGGGTDLQDRTYTYDAAGNITSITDALRAGTLTQTytyDALG 864
|
890 900 910 920 930 940
....*....|....*....|....*....|....*....|....*....|....*....|.
gi 550741602 1127 RTTTIVEGDARHELRFDVRGNLLwRGRGDQGVRWEYDQNSRRTCMIRPNGQSTSYEYDANN 1187
Cdd:COG3209 865 RLTSATDPGTTESYTYDANGNLT-SRTDGGTTTYTYDALGRLVSVTKPDGTTTTYTYDALG 924
|
|
| Hint |
cd00081 |
Hedgehog/Intein domain, found in Hedgehog proteins as well as proteins which contain inteins ... |
1693-1828 |
2.41e-14 |
|
Hedgehog/Intein domain, found in Hedgehog proteins as well as proteins which contain inteins and undergo protein splicing (e.g. DnaB, RIR1-2, GyrA and Pol). In protein splicing an intervening polypeptide sequence - the intein - is excised from a protein, and the flanking polypeptide sequences - the exteins - are joined by a peptide bond. In addition to the autocatalytic splicing domain, many inteins contain an inserted endonuclease domain, which plays a role in spreading inteins. Hedgehog proteins are a major class of intercellular signaling molecules, which control inductive interactions during animal development. The mature signaling forms of hedgehog proteins are the N-terminal fragments, which are covalently linked to cholesterol at their C-termini. This modification is the result of an autoprocessing step catalyzed by the C-terminal fragments, which are aligned here.
Pssm-ID: 238035 [Multi-domain] Cd Length: 136 Bit Score: 71.92 E-value: 2.41e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1693 CFVAGTQVLMADGTSKNIEDVVA--GDMVAAYNpETGQAEPGEVtdTYIHDQVATWQVTTESGTVTTT----AVHPFYV- 1765
Cdd:cd00081 1 CFTGDTLVLLEDGGRKKIEELVEkkGDKVLALD-ETGKLVFSKV--LKVLRRDYEKKFYKIKTESGREitltPDHLLFVl 77
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 550741602 1766 --DGKGWVEAQDLRAGDQLYdeTGSLVTVVSIQATGQIATVYNVTVKDLHNYYVADdsswVLAHN 1828
Cdd:cd00081 78 edGELKWVFASDLKPGDYVL--VPVLEKVKEIEEIEYTGGVYDLTVEDNHNFIANG----VLVHN 136
|
|
| Rhs_assc_core |
TIGR03696 |
RHS repeat-associated core domain; This model represents a conserved unique core sequence ... |
1442-1513 |
2.74e-12 |
|
RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.
Pssm-ID: 274730 [Multi-domain] Cd Length: 77 Bit Score: 63.67 E-value: 2.74e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1442 DPWAVATNVGVSAQGSLLVQGMEW--------MQARVYDPATRGFLSTDPLpGVAGaGWsgNPYAFAGNDPVNFSDPLGL 1513
Cdd:TIGR03696 2 DPYGEVLSESGAAPNPLRFTGQYYdaetglyyNGARYYDPELGRFLSPDPI-GLGG-GL--NLYAYVGNNPVNWVDPLGL 77
|
|
| RhsA |
COG3209 |
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction ... |
198-1095 |
4.32e-10 |
|
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];
Pssm-ID: 442442 [Multi-domain] Cd Length: 1103 Bit Score: 65.16 E-value: 4.32e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 198 VDAALAAFARSCSWVTFDASSVAAALTAWNKANVDEITWADRVRELFVNAGGTGMAVPNSALDAALQASGISSSRDDIVV 277
Cdd:COG3209 155 GVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGAVTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASV 234
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 278 DPPTAMGGQPTSGFADDPVNAATGNFVEPQVDLGFTGGSATLEWGRVYNSMSQVCGAFGPGWSSLAESRLVVTDESARWV 357
Cdd:COG3209 235 AATVTGSATGAAGAGAAVATAATTLGGTTGAGTGASGAGLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGT 314
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 358 QADGRHIVFGRMGEGWGRAQCENLWLEQPTGMGGVAFLIRDNDGGSFAFSQAGRPVFQDRGPGSRVAFTYSEDRLVRLEH 437
Cdd:COG3209 315 TTAAGTTGTAAVSGAADAGTTTTTGTGTGGTTTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGS 394
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 438 EFGRAVELVWSQDGRRVVEALASDGRRVGYVYDEQERLVESTGDTATHRYEWNEQGLMCRVVDADGVVEVDNTYDGQGRV 517
Cdd:COG3209 395 GGGSSTTGVGAGTTTTSTTGGDGGPATAAGALTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTG 474
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 518 VTQRSPHGRLSRYSYLGGRVTVVCDEDGARANTYVHDGKGRLVGVVDAEDHRQSASWDRFGNQVMVTDREGRTVVRSFDT 597
Cdd:COG3209 475 GGTEAGTGGGTLTSGSAGATTLGTDTTLDDTLGGTTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTV 554
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 598 RGHVIAEQTVSGARIEQDFDEQDRLIEVRVINDDQVSVTEMVYQGDNRNPARIVDPVGGVTVLEWDGSLLTKVVDPTGVT 677
Cdd:COG3209 555 GTGTSTGTGGTGTVTTTGDGTGGASTTTGTTGGTATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERAT 634
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 678 LTMGYDAHGDLVSTTNAAGHTARLERDEQGRVVAAITPLGHTTRYRYDERGMCVERIDPDGAVWRFEYSAGGRLLATVDP 757
Cdd:COG3209 635 ASTGSTTGGTTGTGVTTTGTTTTRATGTTGTGTGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGG 714
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 758 LGGRVEVERDEAGEESATIDELGRRVERRVDDLGNLTRVELPDGSawqFSYDAMSRMQSMKDASG-----GVWQFSYDAE 832
Cdd:COG3209 715 TTTRLGTTTTGGGGGTTTDGTGTGGTTGTLTTTSTTTTTTAGALT---YTYDALGRLTSETTPGGvtqgtYTTRYTYDAL 791
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 833 GTLKATTDATGGVRRYEtnqmalptaylddgkreeavYDRLGRMVCHINADATSTKTRYDLaglpveiideaggvtsier 912
Cdd:COG3209 792 GRLTSVTYPDGETVTYT--------------------YDALGRLTSVITVGSGGGTDLQDR------------------- 832
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 913 dlagrpvavtqpmgqtfRYEYDECGRWAATIST---GGDRYEMIYDADGRIEGEVWPTGERVTTtfdecgravarrepgr 989
Cdd:COG3209 833 -----------------TYTYDAAGNITSITDAlraGTLTQTYTYDALGRLTSATDPGTTESYT---------------- 879
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 990 gltrlkYDKLGRVVwsQDSWYGTRRFRYDAAGQMSEVVNAAGGVTHFAYDELGrigevtdpmggvtrhTYDPMGRLLTST 1069
Cdd:COG3209 880 ------YDANGNLT--SRTDGGTTTYTYDALGRLVSVTKPDGTTTTYTYDALG---------------HTDHLGSVRALT 936
|
890 900
....*....|....*....|....*..
gi 550741602 1070 DPLGRVT-RYSYDAAGRVTRRVDGAGA 1095
Cdd:COG3209 937 DASGQVVwRYDYDPFGNLLAETSGAAA 963
|
|
| Tox-REase-5 |
pfam15648 |
Restriction endonuclease fold toxin 5; A predicted toxin of the restriction endonuclease fold ... |
1853-1925 |
7.02e-09 |
|
Restriction endonuclease fold toxin 5; A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, or PrsW-peptidase dependent secretion system. Versions of this domain are also found in caudoviruses.
Pssm-ID: 464785 Cd Length: 95 Bit Score: 54.73 E-value: 7.02e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1853 ARAYELQITGRPEGYYVN---------GVEFDGFQNGE--LLDAKGlGYANLI------PAEWSTAAKQLEDAADRQLEA 1915
Cdd:pfam15648 1 ARAYQARITGFPYGPEYRvkieewlwlGVDFDGFDPAEclLLEAKA-GYDQFFdpklpkPKKFFKGADKLLEQAERQNRA 79
|
90
....*....|....
gi 550741602 1916 --AGSTP--IHWIF 1925
Cdd:pfam15648 80 arAGGPPvrLRWHF 93
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1059-1095 |
1.41e-08 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 52.22 E-value: 1.41e-08
10 20 30
....*....|....*....|....*....|....*..
gi 550741602 1059 YDPMGRLLTSTDPLGRVTRYSYDAAGRVTRRVDGAGA 1095
Cdd:pfam05593 1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
|
|
| HintN |
smart00306 |
Hint (Hedgehog/Intein) domain N-terminal region; Hedgehog/Intein domain, N-terminal region. ... |
1692-1780 |
9.04e-07 |
|
Hint (Hedgehog/Intein) domain N-terminal region; Hedgehog/Intein domain, N-terminal region. Domain has been split to accommodate large insertions of endonucleases.
Pssm-ID: 197642 [Multi-domain] Cd Length: 100 Bit Score: 48.81 E-value: 9.04e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1692 GCFVAGTQVLMADGTSKNIEDVVAGDMVAAYNPETGQAEPGEVTDTYIHDQVATWQVTTESGTVTTTAV--HPFYVDGKG 1769
Cdd:smart00306 1 GCFPGDTLVLTEDGGIKKIEELEEGDKVLALDEGTLKYSPVKVFLVREPKGEKKFYRIKTENGREITLTpdHLLLVRDGG 80
|
90
....*....|....
gi 550741602 1770 ---WVEAQDLRAGD 1780
Cdd:smart00306 81 klvWVFASELKPGD 94
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1038-1074 |
1.89e-06 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 46.05 E-value: 1.89e-06
10 20 30
....*....|....*....|....*....|....*..
gi 550741602 1038 YDELGRIGEVTDPMGGVTRHTYDPMGRLLTSTDPLGR 1074
Cdd:pfam05593 1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
|
|
| Hom_end_hint |
pfam05203 |
Hom_end-associated Hint; Homing endonucleases are encoded by mobile DNA elements that are ... |
1693-1719 |
2.72e-06 |
|
Hom_end-associated Hint; Homing endonucleases are encoded by mobile DNA elements that are found inserted within host genes in all domains of life. The crystal structure of the homing nuclease PI-Sce revealed two domains: an endonucleolytic centre resembling the C-terminal domain of Drosophila melanogaster Hedgehog protein, and a second domain containing the protein-splicing active site. This domain corresponds to the latter protein-splicing domain.
Pssm-ID: 368334 [Multi-domain] Cd Length: 444 Bit Score: 52.05 E-value: 2.72e-06
10 20
....*....|....*....|....*..
gi 550741602 1693 CFVAGTQVLMADGTSKNIEDVVAGDMV 1719
Cdd:pfam05203 1 CFAKGTEVLMADGSIKSIEDIEVGDKV 27
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1288-1323 |
4.88e-06 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 44.90 E-value: 4.88e-06
10 20 30
....*....|....*....|....*....|....*..
gi 550741602 1288 YDANGRMV-VEDAAGTVSRFTYDAASQLVSVTNPDGT 1323
Cdd:pfam05593 1 YDAAGRLTsVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
1059-1099 |
5.49e-06 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 44.89 E-value: 5.49e-06
10 20 30 40
....*....|....*....|....*....|....*....|.
gi 550741602 1059 YDPMGRLLTSTDPLGRVTRYSYDAAGRVTRRVDGAGASMSW 1099
Cdd:TIGR01643 1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRY 41
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1308-1343 |
1.11e-05 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 43.74 E-value: 1.11e-05
10 20 30
....*....|....*....|....*....|....*..
gi 550741602 1308 YDAASQLVSVTNPDG-TTTYAYDDAGRRVREQGPAGE 1343
Cdd:pfam05593 1 YDAAGRLTSVTDPDGrVTTYTYDAAGRLTAVTDPDGT 37
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
682-722 |
4.11e-05 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 42.58 E-value: 4.11e-05
10 20 30 40
....*....|....*....|....*....|....*....|.
gi 550741602 682 YDAHGDLVSTTNAAGHTARLERDEQGRVVAAITPLGHTTRY 722
Cdd:TIGR01643 1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRY 41
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
704-739 |
4.49e-05 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 42.20 E-value: 4.49e-05
10 20 30
....*....|....*....|....*....|....*.
gi 550741602 704 DEQGRVVAAITPLGHTTRYRYDERGMCVERIDPDGA 739
Cdd:pfam05593 2 DAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
704-744 |
6.65e-05 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 41.81 E-value: 6.65e-05
10 20 30 40
....*....|....*....|....*....|....*....|.
gi 550741602 704 DEQGRVVAAITPLGHTTRYRYDERGMCVERIDPDGAVWRFE 744
Cdd:TIGR01643 2 DAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRYE 42
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
1017-1056 |
7.63e-05 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 41.81 E-value: 7.63e-05
10 20 30 40
....*....|....*....|....*....|....*....|
gi 550741602 1017 YDAAGQMSEVVNAAGGVTHFAYDELGRIGEVTDPMGGVTR 1056
Cdd:TIGR01643 1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTR 40
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
724-759 |
8.58e-05 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 41.43 E-value: 8.58e-05
10 20 30
....*....|....*....|....*....|....*.
gi 550741602 724 YDERGMCVERIDPDGAVWRFEYSAGGRLLATVDPLG 759
Cdd:pfam05593 1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1017-1050 |
8.92e-05 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 41.43 E-value: 8.92e-05
10 20 30
....*....|....*....|....*....|....
gi 550741602 1017 YDAAGQMSEVVNAAGGVTHFAYDELGRIGEVTDP 1050
Cdd:pfam05593 1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDP 34
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
682-717 |
9.84e-05 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 41.05 E-value: 9.84e-05
10 20 30
....*....|....*....|....*....|....*.
gi 550741602 682 YDAHGDLVSTTNAAGHTARLERDEQGRVVAAITPLG 717
Cdd:pfam05593 1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
788-822 |
1.02e-04 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 41.05 E-value: 1.02e-04
10 20 30
....*....|....*....|....*....|....*
gi 550741602 788 DDLGNLTRVELPDGSAWQFSYDAMSRMQSMKDASG 822
Cdd:pfam05593 2 DAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
1288-1326 |
1.64e-04 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 40.65 E-value: 1.64e-04
10 20 30 40
....*....|....*....|....*....|....*....|
gi 550741602 1288 YDANGRMV-VEDAAGTVSRFTYDAASQLVSVTNPDGTTTY 1326
Cdd:TIGR01643 1 YDAAGRLTgSTDADGTTTRYTYDAAGRLVEITDADGGSTR 40
|
|
| Hop |
COG1372 |
Intein/homing endonuclease [Replication, recombination and repair, Mobilome: prophages, ... |
1687-1780 |
2.21e-04 |
|
Intein/homing endonuclease [Replication, recombination and repair, Mobilome: prophages, transposons];
Pssm-ID: 440983 [Multi-domain] Cd Length: 866 Bit Score: 46.43 E-value: 2.21e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1687 DVSRHGCFVAGTQVLMADGTSKNIEDVV---AGDMVAAYNPETGQAEPGEVT---DTYIHDQV-----------ATwqvt 1749
Cdd:COG1372 92 DTGTGVCLTGDTLVLTADGRLVPIGELVgsgEDVEVLSLDLDTGKLVWAPVTkvfKTGVKPVYrirtrsgreirAT---- 167
|
90 100 110
....*....|....*....|....*....|.
gi 550741602 1750 tesgtvtttAVHPFYVDgKGWVEAQDLRAGD 1780
Cdd:COG1372 168 ---------PDHPFLTL-SGWKEAGELKPGD 188
|
|
| Bacuni_01323_like |
cd12871 |
Uncharacterized protein conserved in Bacteroidetes; A well-conserved family of 16-stranded ... |
1174-1330 |
5.21e-04 |
|
Uncharacterized protein conserved in Bacteroidetes; A well-conserved family of 16-stranded beta barrels resembling outer membrane porins. The interior of the barrels is mostly occupied by an insert with partially helical structure.
Pssm-ID: 214015 [Multi-domain] Cd Length: 231 Bit Score: 43.56 E-value: 5.21e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1174 PNGQSTSYEYDANNRVSAFIQeglgrvvIDRDSLGRIVSVfadglyASWEYADGAVVrqrVERNGFISESVITRDENGRV 1253
Cdd:cd12871 14 KSTSEYTFEYDADGRLTSITT-------TQEGEAEEITYT------TTITYEPNVIT---VTDDGGKTVSTYTLNEKGYV 77
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1254 VA------DKTDGITTFYSYDQTGQLVRAQTSEGL-VTTWEYD-ANGRMV----VEDAAGTVSRFTYDAASQLVSVTNP- 1320
Cdd:cd12871 78 TScteteyGKGQLRTYTFTYNADGQLTKIVESIGTeYSTITITwNNGDIVsistKSNTEENESKITYTSDKVYNPIVNKg 157
|
170
....*....|....*
gi 550741602 1321 -----DGTTTYAYDD 1330
Cdd:cd12871 158 clmlfGLTLGYDLSD 172
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
1308-1345 |
5.29e-04 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 39.11 E-value: 5.29e-04
10 20 30
....*....|....*....|....*....|....*....
gi 550741602 1308 YDAASQLVSVTNPDGTTT-YAYDDAGRRVREQGPAGERR 1345
Cdd:TIGR01643 1 YDAAGRLTGSTDADGTTTrYTYDAAGRLVEITDADGGST 39
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
1038-1078 |
7.84e-04 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 38.73 E-value: 7.84e-04
10 20 30 40
....*....|....*....|....*....|....*....|.
gi 550741602 1038 YDELGRIGEVTDPMGGVTRHTYDPMGRLLTSTDPLGRVTRY 1078
Cdd:TIGR01643 1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRY 41
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
788-827 |
1.10e-03 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 38.34 E-value: 1.10e-03
10 20 30 40
....*....|....*....|....*....|....*....|
gi 550741602 788 DDLGNLTRVELPDGSAWQFSYDAMSRMQSMKDASGGVWQF 827
Cdd:TIGR01643 2 DAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRY 41
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
808-849 |
1.44e-03 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 37.95 E-value: 1.44e-03
10 20 30 40
....*....|....*....|....*....|....*....|..
gi 550741602 808 YDAMSRMQSMKDASGGVWQFSYDAEGTLKATTDATGGVRRYE 849
Cdd:TIGR01643 1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRYE 42
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
913-948 |
2.43e-03 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 37.19 E-value: 2.43e-03
10 20 30
....*....|....*....|....*....|....*.
gi 550741602 913 DLAGRPVAVTQPMGQTFRYEYDECGRWAATISTGGD 948
Cdd:pfam05593 2 DAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
|
|
| AXH |
smart00536 |
Domain in ataxins and HMG-containing proteins; unknown function |
1693-1770 |
2.64e-03 |
|
Domain in ataxins and HMG-containing proteins; unknown function
Pssm-ID: 197779 Cd Length: 116 Bit Score: 39.39 E-value: 2.64e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1693 CFVAGTQVLMADGTSKNIEDVVAGDMVAAY---NPETGQAE----------PGEVTDTY---IHDQVATWQVTTEsgtvt 1756
Cdd:smart00536 1 CFMKGTRLCLANGSNKKVEDLKTEDFIRSAgcsNDEDLQMStvkrigssglPSVVTLTFdpgVEDALLTVECQVE----- 75
|
90
....*....|....
gi 550741602 1757 ttavHPFYVDGKGW 1770
Cdd:smart00536 76 ----HPFFVKGKGW 85
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
724-765 |
3.37e-03 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 37.18 E-value: 3.37e-03
10 20 30 40
....*....|....*....|....*....|....*....|..
gi 550741602 724 YDERGMCVERIDPDGAVWRFEYSAGGRLLATVDPLGGRVEVE 765
Cdd:TIGR01643 1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRYE 42
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1267-1302 |
3.56e-03 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 36.81 E-value: 3.56e-03
10 20 30
....*....|....*....|....*....|....*..
gi 550741602 1267 YDQTGQLVRAQTSEGLVTTWEYDANGRMVVE-DAAGT 1302
Cdd:pfam05593 1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVtDPDGT 37
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1080-1110 |
5.89e-03 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 36.04 E-value: 5.89e-03
10 20 30
....*....|....*....|....*....|.
gi 550741602 1080 YDAAGRVTRRVDGAGASMSWVYDSAGRRVEE 1110
Cdd:pfam05593 1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAV 31
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
1080-1110 |
9.39e-03 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 35.64 E-value: 9.39e-03
10 20 30
....*....|....*....|....*....|.
gi 550741602 1080 YDAAGRVTRRVDGAGASMSWVYDSAGRRVEE 1110
Cdd:TIGR01643 1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEI 31
|
|
|