Conserved domains on [gi|550741602|gb|ERS41155|]

hypothetical protein HMPREF1271_00785 [Propionibacterium sp. KPL1838]

List of domain hits

Name    Accession    Description    Interval    E-value
RHS_core super family    cl49306    RHS element core protein    285-1513    2.11e-31

The actual alignment was detected with superfamily member NF041261:

Pssm-ID: 469161 [Multi-domain]  Cd Length: 1261  Bit Score: 135.13  E-value: 2.11e-31
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  285 GQPTSGfadDPVNAATGNFVEP-QVDLGFTGGSATLeWGRVYNS----MSQVCGAFGPGWSSLAESRLVVTDESARWVQA 359
Cdd:NF041261   41 GGMTSG---NPVNPLLGAKVLPgETDIALPGPLPFI-LSRTYSSyrtrTPAPVGVFGPGWKAPSDIRLQLRDDGLILNDN 116
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  360 DGRHIVFGRM--GEGwGRAQCENLWLEQptgmGGVAFL----------------IRDNDGGSFAFSQAGRP----VFQDR 417
Cdd:NF041261  117 GGRSIHFEPLfpGEA-VYSRSESLWLVR----GGVAAQpdghtlaalwqalpedIRLSPHLYLATNSAQGPwwilGWSER 191
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  418 GPGS-----------RVaFTYSEDRLVRL-------EHEFGRAVELVWSQDGRRVVEALASDGRRVgyvydEQERLVEST 479
Cdd:NF041261  192 VPGAdevlpaplppyRV-LTGMVDRFGRTltfhreaAGDLAGEITGVTDGAGREFRLVLTTQAQRA-----EEARKQRTS 265
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  480 GDTATHRYEWNEQGLMCRVVDADGVVEVDN---------TYDGQgrvVTQRSPHGRLSRYSYL-GGRVTVVCDEDGARAN 549
Cdd:NF041261  266 SLSSPDGPRPLSSSAFPDTLPGGTEYGPDNgirlsavwlTHDPA---YPESLPAAPLVRYTYTeAGELLAVYDRSNTQVR 342
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  550 TYVHDGK--GRLVGvvdaedHRQSaswdrfgnqvmvtdreGRTVVR-SFDTRGHVIAEQTVSGARIEQDFdEQDRLIEVR 626
Cdd:NF041261  343 AFTYDAQhpGRMVA------HRYA----------------GRPEMCyRYDDTGRVTEQLNPAGLSYRYQY-EQDRITITD 399
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  627 VINDDQV--------------------SVTEMVYQGDNRNPARiVDPVGGVTVLEWD--GSLLTKVVDPTGVTLTMGYDA 684
Cdd:NF041261  400 SLNRREVlhtegegglkrvvkkehadgSVTRSGYDAAGRLTAQ-TDAAGRRTEYSLNvvSGDITDITTPDGRETKFYYND 478
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  685 HGDLVSTTNAAGHTARLERDEQGRVVAAITPLGHTTRYRYDE--RGMCVERIDPDGAVWRFEYSAGGRLLATVDPLGGRV 762
Cdd:NF041261  479 GNQLTSVTSPDGLESRREYDEPGRLVSETSRSGETTRYRYDDphSELPATTTDATGSTKQMTWSRYGQLLAFTDCSGYQT 558
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  763 EVERDEAGEESATIDELGRRVERRVDDLGNLTrvelpdgsawqfsydamsrmqSMKDASGGVWQFSYDAEGTLKATTDAt 842
Cdd:NF041261  559 RYEYDRFGQMTAVHREEGISTYRRYDNRGQLT---------------------SVKDAQGRETRYEYNAAGDLTAVITP- 616
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  843 ggvrryetnqmalptayldDGKREEAVYDRLGRMVCHINADATSTkTRYDLAGLPVEIIDEAGGVTSIERDLAGRPVAVT 922
Cdd:NF041261  617 -------------------DGNRSETQYDAWGKAVSTTQGGLTRS-MEYDAAGRITTLTNENGSHSTFLYDALDRLVQQR 676
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  923 QPMGQTFRYEYDECGrwaatistggdryemiydadgriegevwptgeRVTTTFDEcgravarrepgrGLTRL-KYDKLGR 1001
Cdd:NF041261  677 GFDGRTQRYHYDLTG--------------------------------KLTQSEDE------------GLVTLwHYDESDR 712
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1002 VVWSQDSWYGTRRFRYDAAGQMSEVVNAAGG---VTHFAYDELGRigevtdpMGGVTRHTYDP-MGRLLTSTDplgrvTR 1077
Cdd:NF041261  713 ITHRTVNGEPAEQWQYDEHGWLTDISHLSEGhrvAVHYGYDDKGR-------LTGERQTVENPeTGELLWQHE-----TG 780
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1078 YSYDAAGRVTRRVDGAGASMSWVYDSAGRRVEEWVNDTLLAATTRDFVGRTTT----IVEGDARHELR--FDVRGNLLWR 1151
Cdd:NF041261  781 HAYNEQGLANRVTPDSLPPVEWLTYGSGYLAGMKLGGTPLVEYTRDRLHRETVrsfgGAGSNAAYELTtaYTPAGQLQSQ 860
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1152 GRGDQGVRWEYDQNSRRTcMIRPNG--QSTSYEYDANNRVSAFIQEGLG---RVVIDRDSLGRIV---SVFADGLYASWE 1223
Cdd:NF041261  861 HLNSLVYDRDYTWNDNGD-LVRISGprQTREYGYSATGRLTGVHTTAANldiRIPYATDPAGNRLpdpELHPDSTLTAWP 939
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1224 yadgavvRQRVERNgfiSESVITRDENGRvVADKTDGI-----------TTFYSYDQTGQLV---RAQTSEGLVTT-WEY 1288
Cdd:NF041261  940 -------DNRIAED---AHYVYRYDEYGR-LTEKTDRIpegvirtdderTHHYHYDSQHRLVfytRIQHGEPLVESrYLY 1008
                        1130      1140      1150      1160      1170      1180      1190      1200
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1289 DANGRMVvedaAGTVSRFTYDAASQLvSVTNPDGTTTYAYdDAGRRVREQGPAGERRFTWDPRGFLDSITQINHDGDKVA 1368
Cdd:NF041261 1009 DPLGRRM----AKRVWRRERDLTGWM-SLSRKPEVTWYGW-DGDRLTTVQTDTTRIQTVYQPGSFTPLIRVETENGERAK 1082
                        1210      1220      1230      1240      1250      1260      1270      1280
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1369 AQtQRLWVDALGELARVDQDSVwwdssSFMPTLVQYGARsvVTEAGVTGMVDGPDASWL---------------PPQWSA 1433
Cdd:NF041261 1083 AQ-RRSLAETLQQEGSENGHGV-----VFPAELVRMLDR--LEEEIRADRVSEESRAWLaqcgltveqmarqvePEYTPA 1154
                        1290      1300      1310      1320      1330      1340      1350      1360
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1434 R-------DHGGNPMDPWAVATNVGVSAQ----GSLLVQ---------------------GMEWMQARVYDPATRGFLST 1481
Cdd:NF041261 1155 RklhlyhcDHRGLPLALISEEGNTAWQGEydewGNLLNEenphhlqqpyrlpgqqydeesGLYYNRNRYYDPLQGRYITQ 1234
                        1370      1380      1390
                  ....*....|....*....|....*....|..
gi 550741602 1482 DPLpGVAGaGWsgNPYAFAGNdPVNFSDPLGL 1513
Cdd:NF041261 1235 DPI-GLKG-GW--NLYQYPLN-PIRFIDPLGL 1261
PT-HINT super family    cl25980    Pretoxin HINT domain    1700-1828    1.00e-18

Pretoxin HINT domain; A member of the HINT superfamily of proteases that is usually found N-terminal to the toxin module in polymorphic toxin systems. The domain is predicted to function in releasing the toxin domain by autoproteolysis.


The actual alignment was detected with superfamily member pfam07591:

Pssm-ID: 400120 [Multi-domain]  Cd Length: 136  Bit Score: 84.23  E-value: 1.00e-18
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  1700 VLMADGTsKNIEDVVAGDMVAAYNPETGQAEPGEVTDTYIHDQVATWQVTTESGTVTTTAV-----HPFYVDGkGWVEAQ 1774
Cdd:pfam07591    2 VLTADGY-KAIANIKAGDRVIAKDEASGKTGYKPVTATYGNPYQETVYITISDGIGNSQTLisnfiHPFYSKG-KWIEAG 79
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 550741602  1775 DLRAGDQLYDETGSLVTVVSIQATGQIATVYNVTVKDLHNYYVADD---SSWVLAHN 1828
Cdd:pfam07591   80 DLKVGDKLLDESGNVQTVENIKLKDKPLKAYNLTVADWHTYFVKGNqaeTEGVWVHN 136
Tox-REase-5 super family    cl21440    Restriction endonuclease fold toxin 5    1853-1925    7.02e-09

Restriction endonuclease fold toxin 5; A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, or PrsW-peptidase dependent secretion system. Versions of this domain are also found in caudoviruses.


The actual alignment was detected with superfamily member pfam15648:

Pssm-ID: 464785  Cd Length: 95  Bit Score: 54.73  E-value: 7.02e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  1853 ARAYELQITGRPEGYYVN---------GVEFDGFQNGE--LLDAKGlGYANLI------PAEWSTAAKQLEDAADRQLEA 1915
Cdd:pfam15648    1 ARAYQARITGFPYGPEYRvkieewlwlGVDFDGFDPAEclLLEAKA-GYDQFFdpklpkPKKFFKGADKLLEQAERQNRA 79
                           90
                   ....*....|....
gi 550741602  1916 --AGSTP--IHWIF 1925
Cdd:pfam15648   80 arAGGPPvrLRWHF 93
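
The per-hit summary lines in this listing ("Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...") follow a fixed key/value layout, so they can be turned into records for sorting or filtering. The following is a minimal sketch, assuming the lines look exactly like those shown here; the helper name `parse_hit_summary` and the dictionary keys are hypothetical and not part of any NCBI tool.

```python
import re

# Pattern for a hit summary line as it appears in this listing.
# The "[Multi-domain]" tag is optional (the Tox-REase-5 hit lacks it).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?P<multi>\[Multi-domain\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<evalue>\S+)"
)


def parse_hit_summary(line):
    """Hypothetical helper: parse one summary line into a dict, or return None."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("multi") is not None,
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "evalue": float(match.group("evalue")),
    }
```

Applied to the three summary lines above, this yields one record per hit, which makes it straightforward to rank hits by E-value or to separate multi-domain models from single-domain ones.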
 
Name    Accession    Description    Interval    E-value
RHS_core    NF041261    RHS element core protein    285-1513    2.11e-31


Pssm-ID: 469161 [Multi-domain]  Cd Length: 1261  Bit Score: 135.13  E-value: 2.11e-31
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  285 GQPTSGfadDPVNAATGNFVEP-QVDLGFTGGSATLeWGRVYNS----MSQVCGAFGPGWSSLAESRLVVTDESARWVQA 359
Cdd:NF041261   41 GGMTSG---NPVNPLLGAKVLPgETDIALPGPLPFI-LSRTYSSyrtrTPAPVGVFGPGWKAPSDIRLQLRDDGLILNDN 116
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  360 DGRHIVFGRM--GEGwGRAQCENLWLEQptgmGGVAFL----------------IRDNDGGSFAFSQAGRP----VFQDR 417
Cdd:NF041261  117 GGRSIHFEPLfpGEA-VYSRSESLWLVR----GGVAAQpdghtlaalwqalpedIRLSPHLYLATNSAQGPwwilGWSER 191
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  418 GPGS-----------RVaFTYSEDRLVRL-------EHEFGRAVELVWSQDGRRVVEALASDGRRVgyvydEQERLVEST 479
Cdd:NF041261  192 VPGAdevlpaplppyRV-LTGMVDRFGRTltfhreaAGDLAGEITGVTDGAGREFRLVLTTQAQRA-----EEARKQRTS 265
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  480 GDTATHRYEWNEQGLMCRVVDADGVVEVDN---------TYDGQgrvVTQRSPHGRLSRYSYL-GGRVTVVCDEDGARAN 549
Cdd:NF041261  266 SLSSPDGPRPLSSSAFPDTLPGGTEYGPDNgirlsavwlTHDPA---YPESLPAAPLVRYTYTeAGELLAVYDRSNTQVR 342
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  550 TYVHDGK--GRLVGvvdaedHRQSaswdrfgnqvmvtdreGRTVVR-SFDTRGHVIAEQTVSGARIEQDFdEQDRLIEVR 626
Cdd:NF041261  343 AFTYDAQhpGRMVA------HRYA----------------GRPEMCyRYDDTGRVTEQLNPAGLSYRYQY-EQDRITITD 399
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  627 VINDDQV--------------------SVTEMVYQGDNRNPARiVDPVGGVTVLEWD--GSLLTKVVDPTGVTLTMGYDA 684
Cdd:NF041261  400 SLNRREVlhtegegglkrvvkkehadgSVTRSGYDAAGRLTAQ-TDAAGRRTEYSLNvvSGDITDITTPDGRETKFYYND 478
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  685 HGDLVSTTNAAGHTARLERDEQGRVVAAITPLGHTTRYRYDE--RGMCVERIDPDGAVWRFEYSAGGRLLATVDPLGGRV 762
Cdd:NF041261  479 GNQLTSVTSPDGLESRREYDEPGRLVSETSRSGETTRYRYDDphSELPATTTDATGSTKQMTWSRYGQLLAFTDCSGYQT 558
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  763 EVERDEAGEESATIDELGRRVERRVDDLGNLTrvelpdgsawqfsydamsrmqSMKDASGGVWQFSYDAEGTLKATTDAt 842
Cdd:NF041261  559 RYEYDRFGQMTAVHREEGISTYRRYDNRGQLT---------------------SVKDAQGRETRYEYNAAGDLTAVITP- 616
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  843 ggvrryetnqmalptayldDGKREEAVYDRLGRMVCHINADATSTkTRYDLAGLPVEIIDEAGGVTSIERDLAGRPVAVT 922
Cdd:NF041261  617 -------------------DGNRSETQYDAWGKAVSTTQGGLTRS-MEYDAAGRITTLTNENGSHSTFLYDALDRLVQQR 676
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  923 QPMGQTFRYEYDECGrwaatistggdryemiydadgriegevwptgeRVTTTFDEcgravarrepgrGLTRL-KYDKLGR 1001
Cdd:NF041261  677 GFDGRTQRYHYDLTG--------------------------------KLTQSEDE------------GLVTLwHYDESDR 712
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1002 VVWSQDSWYGTRRFRYDAAGQMSEVVNAAGG---VTHFAYDELGRigevtdpMGGVTRHTYDP-MGRLLTSTDplgrvTR 1077
Cdd:NF041261  713 ITHRTVNGEPAEQWQYDEHGWLTDISHLSEGhrvAVHYGYDDKGR-------LTGERQTVENPeTGELLWQHE-----TG 780
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1078 YSYDAAGRVTRRVDGAGASMSWVYDSAGRRVEEWVNDTLLAATTRDFVGRTTT----IVEGDARHELR--FDVRGNLLWR 1151
Cdd:NF041261  781 HAYNEQGLANRVTPDSLPPVEWLTYGSGYLAGMKLGGTPLVEYTRDRLHRETVrsfgGAGSNAAYELTtaYTPAGQLQSQ 860
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1152 GRGDQGVRWEYDQNSRRTcMIRPNG--QSTSYEYDANNRVSAFIQEGLG---RVVIDRDSLGRIV---SVFADGLYASWE 1223
Cdd:NF041261  861 HLNSLVYDRDYTWNDNGD-LVRISGprQTREYGYSATGRLTGVHTTAANldiRIPYATDPAGNRLpdpELHPDSTLTAWP 939
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1224 yadgavvRQRVERNgfiSESVITRDENGRvVADKTDGI-----------TTFYSYDQTGQLV---RAQTSEGLVTT-WEY 1288
Cdd:NF041261  940 -------DNRIAED---AHYVYRYDEYGR-LTEKTDRIpegvirtdderTHHYHYDSQHRLVfytRIQHGEPLVESrYLY 1008
                        1130      1140      1150      1160      1170      1180      1190      1200
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1289 DANGRMVvedaAGTVSRFTYDAASQLvSVTNPDGTTTYAYdDAGRRVREQGPAGERRFTWDPRGFLDSITQINHDGDKVA 1368
Cdd:NF041261 1009 DPLGRRM----AKRVWRRERDLTGWM-SLSRKPEVTWYGW-DGDRLTTVQTDTTRIQTVYQPGSFTPLIRVETENGERAK 1082
                        1210      1220      1230      1240      1250      1260      1270      1280
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1369 AQtQRLWVDALGELARVDQDSVwwdssSFMPTLVQYGARsvVTEAGVTGMVDGPDASWL---------------PPQWSA 1433
Cdd:NF041261 1083 AQ-RRSLAETLQQEGSENGHGV-----VFPAELVRMLDR--LEEEIRADRVSEESRAWLaqcgltveqmarqvePEYTPA 1154
                        1290      1300      1310      1320      1330      1340      1350      1360
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1434 R-------DHGGNPMDPWAVATNVGVSAQ----GSLLVQ---------------------GMEWMQARVYDPATRGFLST 1481
Cdd:NF041261 1155 RklhlyhcDHRGLPLALISEEGNTAWQGEydewGNLLNEenphhlqqpyrlpgqqydeesGLYYNRNRYYDPLQGRYITQ 1234
                        1370      1380      1390
                  ....*....|....*....|....*....|..
gi 550741602 1482 DPLpGVAGaGWsgNPYAFAGNdPVNFSDPLGL 1513
Cdd:NF041261 1235 DPI-GLKG-GW--NLYQYPLN-PIRFIDPLGL 1261
RhsA    COG3209    Uncharacterized conserved protein RhsA, contains 28 RHS repeats    510-1541    9.59e-21

Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];


Pssm-ID: 442442 [Multi-domain]  Cd Length: 1103  Bit Score: 100.22  E-value: 9.59e-21
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  510 TYDGQGRVVTQRSPHGRLSRYSYLGGRVTVVCDEDGARANTYVHDGKGRLVGVVDAEDHRQSASWDRFGNQVMVTDREGR 589
Cdd:COG3209    14 SSTLLAATNAGGGTAVTNAGSTVLLAKGGLSTAAAAGGAATLTARSASTTDVVGTLTGAGGTSAGGVTALGDASAAGGGY 93
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  590 TVVRSFDTRGHVIAEQTVSGARIEQDFDEQDRLIEVRVINDDQVSVTEMVYQGDNRNPARIVDPVGGVTVLEWDGSLLTK 669
Cdd:COG3209    94 VGGAAAGGGATLTGLAAATASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTL 173
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  670 VVDPTGVTLTMGYDAHGDLVSTTNAA----GHTARLERDEQGRVVAAITPLGHTTRYRYDERGMCVERIDPDGAVWRFEY 745
Cdd:COG3209   174 GGAAAGPATGVGTGAVTLATGLAGSAllalGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVA 253
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  746 SAGGRLLATVDPLGGRVEVERDEAGEESATIDELGRRVERRVDDLGNLTRVELPDGSAWQFSYDAMSRMQSMKDASGGVW 825
Cdd:COG3209   254 TAATTLGGTTGAGTGASGAGLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAG 333
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  826 QFSYDAEGTLKATTDATGGVRRYETNQMALPTAYLDDGKREEAVYDRLGRMVCHINADATSTKTRYDLAGLPVEIIDEAG 905
Cdd:COG3209   334 TTTTTGTGTGGTTTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTT 413
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  906 GVTSIERDLAGRPVAVTQPMGQTFRYEYDECGRWAATISTGGDRYEMIYDADGRIEGEVWPTGERVTTTFDECGRAVARR 985
Cdd:COG3209   414 GGDGGPATAAGALTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGA 493
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  986 EPGRGLTRLKYDKLGRVVWSQDSWYGTRRFRYDAAGQMSEVVNAAGGVTHFAYDELGRIGEVTDPMGGVTRHTYDPMGRL 1065
Cdd:COG3209   494 TTLGTDTTLDDTLGGTTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGD 573
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1066 LTSTDPLGRVTRYSYDAAGRVTRRVDGAGASMSWVYDSAGRRVEEWVNDTLLAATTRDFVGRTTTIVEGDARHELRFDVR 1145
Cdd:COG3209   574 GTGGASTTTGTTGGTATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGTGVTTTG 653
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1146 GNLLWRGRGDQGVRWEYDQNSRRTCMIRPNGQSTSYEYDANNRVSAFIQEGLGRVVIDRDSLGRIVSVFADGLYASWEYA 1225
Cdd:COG3209   654 TTTTRATGTTGTGTGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTD 733
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1226 DGAVVRQRVERNG-------FISESVITRDENGRVVADKT------DGITTFYSYDQTGQLVRAQTSEGLVTTWEYDANG 1292
Cdd:COG3209   734 GTGTGGTTGTLTTtstttttTAGALTYTYDALGRLTSETTpggvtqGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALG 813
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1293 RMV----VEDAAGTVS---RFTYDAASQLVSVTN----PDGTTTYAYDDAGRRVREQGPAGERRFTWDPRGFLDSitqin 1361
Cdd:COG3209   814 RLTsvitVGSGGGTDLqdrTYTYDAAGNITSITDalraGTLTQTYTYDALGRLTSATDPGTTESYTYDANGNLTS----- 888
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1362 hdgdKVAAQTQRLWVDALGELARVDQDSVwwDSSSFmptlvQYGARSVVTEAG-VTGMVDGPDAswlpPQWSARdhggnp 1440
Cdd:COG3209   889 ----RTDGGTTTYTYDALGRLVSVTKPDG--TTTTY-----TYDALGHTDHLGsVRALTDASGQ----VVWRYD------ 947
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1441 MDPW-AVATNVGVSAQGSLLVQGMEW--------MQARVYDPATRGFLSTDPLPGVAGAgwsgNPYAFAGNDPVNFSDPL 1511
Cdd:COG3209   948 YDPFgNLLAETSGAAANPLRFTGQEYdaetglyyNGARYYDPALGRFLSPDPIGLAGGL----NLYAYVGNNPVNYVDPL 1023
                        1050      1060      1070
                  ....*....|....*....|....*....|
gi 550741602 1512 GLRPLTDEDLKGYRDAARSPLAKAADAAGG 1541
Cdd:COG3209  1024 GLAALLGTTGLGGGAGVGAGAAGGGAAAAG 1053
DUF6531    pfam20148    Domain of unknown function (DUF6531)    294-366    1.87e-20

Domain of unknown function (DUF6531); This putative domain is found in a range of RHS proteins.


Pssm-ID: 466309 [Multi-domain]  Cd Length: 74  Bit Score: 86.82  E-value: 1.87e-20
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 550741602   294 DPVNAATGNFVEPQVDLGFtGGSATLEWGRVYNSMSQVCGAFGPGWSSLAESRLVV-TDESARWVQADGRHIVF 366
Cdd:pfam20148    2 DPVNVATGNKVLEETDFSL-PGPLPLVWTRTYNSSSERDGPLGPGWSHPYDQRLELeGDGGVVYIDADGREVTF 74
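
Each alignment block pairs a query row (`gi ...`) with a model row (`Cdd:...`). One way to summarize such a pair is to count identical residues over the aligned columns. The sketch below assumes that lowercase letters mark residues that are not aligned to the model and that `-` marks a gap, so only columns with an uppercase residue in both rows are counted; that reading of the display convention is an assumption, and `aligned_identity` is a hypothetical helper name.

```python
def aligned_identity(query_row, subject_row):
    """Count (identical, aligned) columns for two equal-length alignment rows.

    Only columns where both rows hold an uppercase residue are treated as
    aligned; gaps ('-') and lowercase residues are skipped (an assumption
    about how the listing marks unaligned positions).
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    aligned = 0
    for q, s in zip(query_row, subject_row):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned


# Usage with the DUF6531 rows shown above (sequence strings only,
# without the leading identifiers and coordinates).
query = "DPVNAATGNFVEPQVDLGFtGGSATLEWGRVYNSMSQVCGAFGPGWSSLAESRLVV-TDESARWVQADGRHIVF"
model = "DPVNVATGNKVLEETDFSL-PGPLPLVWTRTYNSSSERDGPLGPGWSHPYDQRLELeGDGGVVYIDADGREVTF"
identical, aligned = aligned_identity(query, model)
print(f"{identical}/{aligned} identical aligned columns")
```

Reporting the raw counts rather than a single percentage keeps the denominator visible, which matters for short domains such as this one.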
PT-HINT    pfam07591    Pretoxin HINT domain    1700-1828    1.00e-18

Pretoxin HINT domain; A member of the HINT superfamily of proteases that is usually found N-terminal to the toxin module in polymorphic toxin systems. The domain is predicted to function in releasing the toxin domain by autoproteolysis.


Pssm-ID: 400120 [Multi-domain]  Cd Length: 136  Bit Score: 84.23  E-value: 1.00e-18
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  1700 VLMADGTsKNIEDVVAGDMVAAYNPETGQAEPGEVTDTYIHDQVATWQVTTESGTVTTTAV-----HPFYVDGkGWVEAQ 1774
Cdd:pfam07591    2 VLTADGY-KAIANIKAGDRVIAKDEASGKTGYKPVTATYGNPYQETVYITISDGIGNSQTLisnfiHPFYSKG-KWIEAG 79
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 550741602  1775 DLRAGDQLYDETGSLVTVVSIQATGQIATVYNVTVKDLHNYYVADD---SSWVLAHN 1828
Cdd:pfam07591   80 DLKVGDKLLDESGNVQTVENIKLKDKPLKAYNLTVADWHTYFVKGNqaeTEGVWVHN 136
Hint    cd00081    Hedgehog/Intein domain    1693-1828    2.41e-14

Hedgehog/Intein domain, found in Hedgehog proteins as well as proteins which contain inteins and undergo protein splicing (e.g. DnaB, RIR1-2, GyrA and Pol). In protein splicing an intervening polypeptide sequence - the intein - is excised from a protein, and the flanking polypeptide sequences - the exteins - are joined by a peptide bond. In addition to the autocatalytic splicing domain, many inteins contain an inserted endonuclease domain, which plays a role in spreading inteins. Hedgehog proteins are a major class of intercellular signaling molecules, which control inductive interactions during animal development. The mature signaling forms of hedgehog proteins are the N-terminal fragments, which are covalently linked to cholesterol at their C-termini. This modification is the result of an autoprocessing step catalyzed by the C-terminal fragments, which are aligned here.


Pssm-ID: 238035 [Multi-domain]  Cd Length: 136  Bit Score: 71.92  E-value: 2.41e-14
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1693 CFVAGTQVLMADGTSKNIEDVVA--GDMVAAYNpETGQAEPGEVtdTYIHDQVATWQVTTESGTVTTT----AVHPFYV- 1765
Cdd:cd00081     1 CFTGDTLVLLEDGGRKKIEELVEkkGDKVLALD-ETGKLVFSKV--LKVLRRDYEKKFYKIKTESGREitltPDHLLFVl 77
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 550741602 1766 --DGKGWVEAQDLRAGDQLYdeTGSLVTVVSIQATGQIATVYNVTVKDLHNYYVADdsswVLAHN 1828
Cdd:cd00081    78 edGELKWVFASDLKPGDYVL--VPVLEKVKEIEEIEYTGGVYDLTVEDNHNFIANG----VLVHN 136
Rhs_assc_core    TIGR03696    RHS repeat-associated core domain    1442-1513    2.74e-12

RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.


Pssm-ID: 274730 [Multi-domain]  Cd Length: 77  Bit Score: 63.67  E-value: 2.74e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  1442 DPWAVATNVGVSAQGSLLVQGMEW--------MQARVYDPATRGFLSTDPLpGVAGaGWsgNPYAFAGNDPVNFSDPLGL 1513
Cdd:TIGR03696    2 DPYGEVLSESGAAPNPLRFTGQYYdaetglyyNGARYYDPELGRFLSPDPI-GLGG-GL--NLYAYVGNNPVNWVDPLGL 77
Tox-REase-5    pfam15648    Restriction endonuclease fold toxin 5    1853-1925    7.02e-09

Restriction endonuclease fold toxin 5; A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, or PrsW-peptidase dependent secretion system. Versions of this domain are also found in caudoviruses.


Pssm-ID: 464785  Cd Length: 95  Bit Score: 54.73  E-value: 7.02e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  1853 ARAYELQITGRPEGYYVN---------GVEFDGFQNGE--LLDAKGlGYANLI------PAEWSTAAKQLEDAADRQLEA 1915
Cdd:pfam15648    1 ARAYQARITGFPYGPEYRvkieewlwlGVDFDGFDPAEclLLEAKA-GYDQFFdpklpkPKKFFKGADKLLEQAERQNRA 79
                           90
                   ....*....|....
gi 550741602  1916 --AGSTP--IHWIF 1925
Cdd:pfam15648   80 arAGGPPvrLRWHF 93
HintN    smart00306    Hint (Hedgehog/Intein) domain N-terminal region    1692-1780    9.04e-07

Hint (Hedgehog/Intein) domain N-terminal region; Hedgehog/Intein domain, N-terminal region. Domain has been split to accommodate large insertions of endonucleases.


Pssm-ID: 197642 [Multi-domain]  Cd Length: 100  Bit Score: 48.81  E-value: 9.04e-07
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602   1692 GCFVAGTQVLMADGTSKNIEDVVAGDMVAAYNPETGQAEPGEVTDTYIHDQVATWQVTTESGTVTTTAV--HPFYVDGKG 1769
Cdd:smart00306    1 GCFPGDTLVLTEDGGIKKIEELEEGDKVLALDEGTLKYSPVKVFLVREPKGEKKFYRIKTENGREITLTpdHLLLVRDGG 80
                            90
                    ....*....|....
gi 550741602   1770 ---WVEAQDLRAGD 1780
Cdd:smart00306   81 klvWVFASELKPGD 94
Hop    COG1372    Intein/homing endonuclease    1687-1780    2.21e-04

Intein/homing endonuclease [Replication, recombination and repair, Mobilome: prophages, transposons];


Pssm-ID: 440983 [Multi-domain]  Cd Length: 866  Bit Score: 46.43  E-value: 2.21e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1687 DVSRHGCFVAGTQVLMADGTSKNIEDVV---AGDMVAAYNPETGQAEPGEVT---DTYIHDQV-----------ATwqvt 1749
Cdd:COG1372    92 DTGTGVCLTGDTLVLTADGRLVPIGELVgsgEDVEVLSLDLDTGKLVWAPVTkvfKTGVKPVYrirtrsgreirAT---- 167
                          90       100       110
                  ....*....|....*....|....*....|.
gi 550741602 1750 tesgtvtttAVHPFYVDgKGWVEAQDLRAGD 1780
Cdd:COG1372   168 ---------PDHPFLTL-SGWKEAGELKPGD 188
Bacuni_01323_like    cd12871    Uncharacterized protein conserved in Bacteroidetes    1174-1330    5.21e-04

Uncharacterized protein conserved in Bacteroidetes; A well-conserved family of 16-stranded beta barrels resembling outer membrane porins. The interior of the barrels is mostly occupied by an insert with partially helical structure.


Pssm-ID: 214015 [Multi-domain]  Cd Length: 231  Bit Score: 43.56  E-value: 5.21e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1174 PNGQSTSYEYDANNRVSAFIQeglgrvvIDRDSLGRIVSVfadglyASWEYADGAVVrqrVERNGFISESVITRDENGRV 1253
Cdd:cd12871    14 KSTSEYTFEYDADGRLTSITT-------TQEGEAEEITYT------TTITYEPNVIT---VTDDGGKTVSTYTLNEKGYV 77
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1254 VA------DKTDGITTFYSYDQTGQLVRAQTSEGL-VTTWEYD-ANGRMV----VEDAAGTVSRFTYDAASQLVSVTNP- 1320
Cdd:cd12871    78 TScteteyGKGQLRTYTFTYNADGQLTKIVESIGTeYSTITITwNNGDIVsistKSNTEENESKITYTSDKVYNPIVNKg 157
                         170
                  ....*....|....*
gi 550741602 1321 -----DGTTTYAYDD 1330
Cdd:cd12871   158 clmlfGLTLGYDLSD 172
 
Name Accession Description Interval E-value
RHS_core NF041261
RHS element core protein;
285-1513 2.11e-31


Pssm-ID: 469161 [Multi-domain]  Cd Length: 1261  Bit Score: 135.13  E-value: 2.11e-31
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  285 GQPTSGfadDPVNAATGNFVEP-QVDLGFTGGSATLeWGRVYNS----MSQVCGAFGPGWSSLAESRLVVTDESARWVQA 359
Cdd:NF041261   41 GGMTSG---NPVNPLLGAKVLPgETDIALPGPLPFI-LSRTYSSyrtrTPAPVGVFGPGWKAPSDIRLQLRDDGLILNDN 116
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  360 DGRHIVFGRM--GEGwGRAQCENLWLEQptgmGGVAFL----------------IRDNDGGSFAFSQAGRP----VFQDR 417
Cdd:NF041261  117 GGRSIHFEPLfpGEA-VYSRSESLWLVR----GGVAAQpdghtlaalwqalpedIRLSPHLYLATNSAQGPwwilGWSER 191
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  418 GPGS-----------RVaFTYSEDRLVRL-------EHEFGRAVELVWSQDGRRVVEALASDGRRVgyvydEQERLVEST 479
Cdd:NF041261  192 VPGAdevlpaplppyRV-LTGMVDRFGRTltfhreaAGDLAGEITGVTDGAGREFRLVLTTQAQRA-----EEARKQRTS 265
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  480 GDTATHRYEWNEQGLMCRVVDADGVVEVDN---------TYDGQgrvVTQRSPHGRLSRYSYL-GGRVTVVCDEDGARAN 549
Cdd:NF041261  266 SLSSPDGPRPLSSSAFPDTLPGGTEYGPDNgirlsavwlTHDPA---YPESLPAAPLVRYTYTeAGELLAVYDRSNTQVR 342
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  550 TYVHDGK--GRLVGvvdaedHRQSaswdrfgnqvmvtdreGRTVVR-SFDTRGHVIAEQTVSGARIEQDFdEQDRLIEVR 626
Cdd:NF041261  343 AFTYDAQhpGRMVA------HRYA----------------GRPEMCyRYDDTGRVTEQLNPAGLSYRYQY-EQDRITITD 399
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  627 VINDDQV--------------------SVTEMVYQGDNRNPARiVDPVGGVTVLEWD--GSLLTKVVDPTGVTLTMGYDA 684
Cdd:NF041261  400 SLNRREVlhtegegglkrvvkkehadgSVTRSGYDAAGRLTAQ-TDAAGRRTEYSLNvvSGDITDITTPDGRETKFYYND 478
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  685 HGDLVSTTNAAGHTARLERDEQGRVVAAITPLGHTTRYRYDE--RGMCVERIDPDGAVWRFEYSAGGRLLATVDPLGGRV 762
Cdd:NF041261  479 GNQLTSVTSPDGLESRREYDEPGRLVSETSRSGETTRYRYDDphSELPATTTDATGSTKQMTWSRYGQLLAFTDCSGYQT 558
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  763 EVERDEAGEESATIDELGRRVERRVDDLGNLTrvelpdgsawqfsydamsrmqSMKDASGGVWQFSYDAEGTLKATTDAt 842
Cdd:NF041261  559 RYEYDRFGQMTAVHREEGISTYRRYDNRGQLT---------------------SVKDAQGRETRYEYNAAGDLTAVITP- 616
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  843 ggvrryetnqmalptayldDGKREEAVYDRLGRMVCHINADATSTkTRYDLAGLPVEIIDEAGGVTSIERDLAGRPVAVT 922
Cdd:NF041261  617 -------------------DGNRSETQYDAWGKAVSTTQGGLTRS-MEYDAAGRITTLTNENGSHSTFLYDALDRLVQQR 676
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  923 QPMGQTFRYEYDECGrwaatistggdryemiydadgriegevwptgeRVTTTFDEcgravarrepgrGLTRL-KYDKLGR 1001
Cdd:NF041261  677 GFDGRTQRYHYDLTG--------------------------------KLTQSEDE------------GLVTLwHYDESDR 712
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1002 VVWSQDSWYGTRRFRYDAAGQMSEVVNAAGG---VTHFAYDELGRigevtdpMGGVTRHTYDP-MGRLLTSTDplgrvTR 1077
Cdd:NF041261  713 ITHRTVNGEPAEQWQYDEHGWLTDISHLSEGhrvAVHYGYDDKGR-------LTGERQTVENPeTGELLWQHE-----TG 780
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1078 YSYDAAGRVTRRVDGAGASMSWVYDSAGRRVEEWVNDTLLAATTRDFVGRTTT----IVEGDARHELR--FDVRGNLLWR 1151
Cdd:NF041261  781 HAYNEQGLANRVTPDSLPPVEWLTYGSGYLAGMKLGGTPLVEYTRDRLHRETVrsfgGAGSNAAYELTtaYTPAGQLQSQ 860
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1152 GRGDQGVRWEYDQNSRRTcMIRPNG--QSTSYEYDANNRVSAFIQEGLG---RVVIDRDSLGRIV---SVFADGLYASWE 1223
Cdd:NF041261  861 HLNSLVYDRDYTWNDNGD-LVRISGprQTREYGYSATGRLTGVHTTAANldiRIPYATDPAGNRLpdpELHPDSTLTAWP 939
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1224 yadgavvRQRVERNgfiSESVITRDENGRvVADKTDGI-----------TTFYSYDQTGQLV---RAQTSEGLVTT-WEY 1288
Cdd:NF041261  940 -------DNRIAED---AHYVYRYDEYGR-LTEKTDRIpegvirtdderTHHYHYDSQHRLVfytRIQHGEPLVESrYLY 1008
                        1130      1140      1150      1160      1170      1180      1190      1200
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1289 DANGRMVvedaAGTVSRFTYDAASQLvSVTNPDGTTTYAYdDAGRRVREQGPAGERRFTWDPRGFLDSITQINHDGDKVA 1368
Cdd:NF041261 1009 DPLGRRM----AKRVWRRERDLTGWM-SLSRKPEVTWYGW-DGDRLTTVQTDTTRIQTVYQPGSFTPLIRVETENGERAK 1082
                        1210      1220      1230      1240      1250      1260      1270      1280
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1369 AQtQRLWVDALGELARVDQDSVwwdssSFMPTLVQYGARsvVTEAGVTGMVDGPDASWL---------------PPQWSA 1433
Cdd:NF041261 1083 AQ-RRSLAETLQQEGSENGHGV-----VFPAELVRMLDR--LEEEIRADRVSEESRAWLaqcgltveqmarqvePEYTPA 1154
                        1290      1300      1310      1320      1330      1340      1350      1360
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1434 R-------DHGGNPMDPWAVATNVGVSAQ----GSLLVQ---------------------GMEWMQARVYDPATRGFLST 1481
Cdd:NF041261 1155 RklhlyhcDHRGLPLALISEEGNTAWQGEydewGNLLNEenphhlqqpyrlpgqqydeesGLYYNRNRYYDPLQGRYITQ 1234
                        1370      1380      1390
                  ....*....|....*....|....*....|..
gi 550741602 1482 DPLpGVAGaGWsgNPYAFAGNdPVNFSDPLGL 1513
Cdd:NF041261 1235 DPI-GLKG-GW--NLYQYPLN-PIRFIDPLGL 1261
RhsA COG3209
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction ...
510-1541 9.59e-21

Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];


Pssm-ID: 442442 [Multi-domain]  Cd Length: 1103  Bit Score: 100.22  E-value: 9.59e-21
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  510 TYDGQGRVVTQRSPHGRLSRYSYLGGRVTVVCDEDGARANTYVHDGKGRLVGVVDAEDHRQSASWDRFGNQVMVTDREGR 589
Cdd:COG3209    14 SSTLLAATNAGGGTAVTNAGSTVLLAKGGLSTAAAAGGAATLTARSASTTDVVGTLTGAGGTSAGGVTALGDASAAGGGY 93
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  590 TVVRSFDTRGHVIAEQTVSGARIEQDFDEQDRLIEVRVINDDQVSVTEMVYQGDNRNPARIVDPVGGVTVLEWDGSLLTK 669
Cdd:COG3209    94 VGGAAAGGGATLTGLAAATASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTL 173
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  670 VVDPTGVTLTMGYDAHGDLVSTTNAA----GHTARLERDEQGRVVAAITPLGHTTRYRYDERGMCVERIDPDGAVWRFEY 745
Cdd:COG3209   174 GGAAAGPATGVGTGAVTLATGLAGSAllalGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVA 253
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  746 SAGGRLLATVDPLGGRVEVERDEAGEESATIDELGRRVERRVDDLGNLTRVELPDGSAWQFSYDAMSRMQSMKDASGGVW 825
Cdd:COG3209   254 TAATTLGGTTGAGTGASGAGLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAG 333
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  826 QFSYDAEGTLKATTDATGGVRRYETNQMALPTAYLDDGKREEAVYDRLGRMVCHINADATSTKTRYDLAGLPVEIIDEAG 905
Cdd:COG3209   334 TTTTTGTGTGGTTTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTT 413
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  906 GVTSIERDLAGRPVAVTQPMGQTFRYEYDECGRWAATISTGGDRYEMIYDADGRIEGEVWPTGERVTTTFDECGRAVARR 985
Cdd:COG3209   414 GGDGGPATAAGALTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGA 493
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  986 EPGRGLTRLKYDKLGRVVWSQDSWYGTRRFRYDAAGQMSEVVNAAGGVTHFAYDELGRIGEVTDPMGGVTRHTYDPMGRL 1065
Cdd:COG3209   494 TTLGTDTTLDDTLGGTTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGD 573
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1066 LTSTDPLGRVTRYSYDAAGRVTRRVDGAGASMSWVYDSAGRRVEEWVNDTLLAATTRDFVGRTTTIVEGDARHELRFDVR 1145
Cdd:COG3209   574 GTGGASTTTGTTGGTATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGTGVTTTG 653
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1146 GNLLWRGRGDQGVRWEYDQNSRRTCMIRPNGQSTSYEYDANNRVSAFIQEGLGRVVIDRDSLGRIVSVFADGLYASWEYA 1225
Cdd:COG3209   654 TTTTRATGTTGTGTGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTD 733
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1226 DGAVVRQRVERNG-------FISESVITRDENGRVVADKT------DGITTFYSYDQTGQLVRAQTSEGLVTTWEYDANG 1292
Cdd:COG3209   734 GTGTGGTTGTLTTtstttttTAGALTYTYDALGRLTSETTpggvtqGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALG 813
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1293 RMV----VEDAAGTVS---RFTYDAASQLVSVTN----PDGTTTYAYDDAGRRVREQGPAGERRFTWDPRGFLDSitqin 1361
Cdd:COG3209   814 RLTsvitVGSGGGTDLqdrTYTYDAAGNITSITDalraGTLTQTYTYDALGRLTSATDPGTTESYTYDANGNLTS----- 888
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1362 hdgdKVAAQTQRLWVDALGELARVDQDSVwwDSSSFmptlvQYGARSVVTEAG-VTGMVDGPDAswlpPQWSARdhggnp 1440
Cdd:COG3209   889 ----RTDGGTTTYTYDALGRLVSVTKPDG--TTTTY-----TYDALGHTDHLGsVRALTDASGQ----VVWRYD------ 947
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1441 MDPW-AVATNVGVSAQGSLLVQGMEW--------MQARVYDPATRGFLSTDPLPGVAGAgwsgNPYAFAGNDPVNFSDPL 1511
Cdd:COG3209   948 YDPFgNLLAETSGAAANPLRFTGQEYdaetglyyNGARYYDPALGRFLSPDPIGLAGGL----NLYAYVGNNPVNYVDPL 1023
                        1050      1060      1070
                  ....*....|....*....|....*....|
gi 550741602 1512 GLRPLTDEDLKGYRDAARSPLAKAADAAGG 1541
Cdd:COG3209  1024 GLAALLGTTGLGGGAGVGAGAAGGGAAAAG 1053
DUF6531 pfam20148
Domain of unknown function (DUF6531); This putative domain is found in a range of RHS proteins.
294-366 1.87e-20


Pssm-ID: 466309 [Multi-domain]  Cd Length: 74  Bit Score: 86.82  E-value: 1.87e-20
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 550741602   294 DPVNAATGNFVEPQVDLGFtGGSATLEWGRVYNSMSQVCGAFGPGWSSLAESRLVV-TDESARWVQADGRHIVF 366
Cdd:pfam20148    2 DPVNVATGNKVLEETDFSL-PGPLPLVWTRTYNSSSERDGPLGPGWSHPYDQRLELeGDGGVVYIDADGREVTF 74
PT-HINT pfam07591
Pretoxin HINT domain; A member of the HINT superfamily of proteases that is usually found ...
1700-1828 1.00e-18

Pretoxin HINT domain; A member of the HINT superfamily of proteases that is usually found N-terminal to the toxin module in polymorphic toxin systems. The domain is predicted to function in releasing the toxin domain by autoproteolysis.


Pssm-ID: 400120 [Multi-domain]  Cd Length: 136  Bit Score: 84.23  E-value: 1.00e-18
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  1700 VLMADGTsKNIEDVVAGDMVAAYNPETGQAEPGEVTDTYIHDQVATWQVTTESGTVTTTAV-----HPFYVDGkGWVEAQ 1774
Cdd:pfam07591    2 VLTADGY-KAIANIKAGDRVIAKDEASGKTGYKPVTATYGNPYQETVYITISDGIGNSQTLisnfiHPFYSKG-KWIEAG 79
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 550741602  1775 DLRAGDQLYDETGSLVTVVSIQATGQIATVYNVTVKDLHNYYVADD---SSWVLAHN 1828
Cdd:pfam07591   80 DLKVGDKLLDESGNVQTVENIKLKDKPLKAYNLTVADWHTYFVKGNqaeTEGVWVHN 136
RhsA COG3209
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction ...
265-1187 1.21e-17

Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];


Pssm-ID: 442442 [Multi-domain]  Cd Length: 1103  Bit Score: 89.82  E-value: 1.21e-17
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  265 ASGISSSRDDIVVDPPTAMGGQPTSGFADDPVNAATGNFVEPQVDLGFTGGSATLEWGRVYNSMSQVCGAFGPGWSSLAE 344
Cdd:COG3209     1 ETSLGLVGGTTGASSTLLAATNAGGGTAVTNAGSTVLLAKGGLSTAAAAGGAATLTARSASTTDVVGTLTGAGGTSAGGV 80
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  345 SRLVVTDESARWVQADGRHIVFGRM-GEGWGRAQCENLWLEQPTGMGGVAFLIRDNDGGSFAFSQAGRPVFQDRGPGSRV 423
Cdd:COG3209    81 TALGDASAAGGGYVGGAAAGGGATLtGLAAATASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTG 160
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  424 AFTYSEDRLVRLEHEFGRAVELVWSQDGRRVVEALASDGRRVGYVYDEQERLVESTGDTATHRYEWNEQGLM--CRVVDA 501
Cdd:COG3209   161 LAGGGASAYGLTLGGAAAGPATGVGTGAVTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPAsvAATVTG 240
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  502 DGVVEVDNTYDGQGRVVTQRSPHGRLSRYSYLGGRVTVVCDEDGARANTYVHDGKGRLVGVVDAEDHRQSASWDRFGNQV 581
Cdd:COG3209   241 SATGAAGAGAAVATAATTLGGTTGAGTGASGAGLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGT 320
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  582 MVTDREGRTVVRSFDT---RGHVIAEQTVSGARIEQDFDEQDRLIEVRVINDDQVSVTEMVYQGDNRNPARIVDPVGGVT 658
Cdd:COG3209   321 TGTAAVSGAADAGTTTttgTGTGGTTTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSST 400
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  659 VLEWDGSLLTKVVDPTGVTLTMGYDAHGDLVSTTNAAGHTARLERDEQGRVVAAITPLGHTTRYRYDERGMCVERIDPDG 738
Cdd:COG3209   401 TGVGAGTTTTSTTGGDGGPATAAGALTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAG 480
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  739 AVWRFEYSAGGR---LLATVDPLGGRVEVERDEAGEESATIDELGRRVERRVDDLGNLTRVELPDGSAWQFSYDAMSRMQ 815
Cdd:COG3209   481 TGGGTLTSGSAGattLGTDTTLDDTLGGTTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTST 560
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  816 SMKDASGGVWQFSYDAEGTLKATTDATGGVRRYETNQMALPTAYLDDGKREEAVYDRLGRMVCHINADATSTKTRYDLAG 895
Cdd:COG3209   561 GTGGTGTVTTTGDGTGGASTTTGTTGGTATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGST 640
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  896 LPVEIIDEAGGVTSIERDLAGRPVAVTQPMGQTFRYEYDECGRWAATISTGGDRYEMIYDADGRIEGEVWPTGERVTTTF 975
Cdd:COG3209   641 TGGTTGTGVTTTGTTTTRATGTTGTGTGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLG 720
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  976 DECGRAVARREPGRGLTRLKYDKLGRVVWSQDSWYGTRRFRYDAAGQMSEVVNAAGGVTHfaydelgrigevtdpmGGVT 1055
Cdd:COG3209   721 TTTTGGGGGTTTDGTGTGGTTGTLTTTSTTTTTTAGALTYTYDALGRLTSETTPGGVTQG----------------TYTT 784
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1056 RHTYDPMGRLLTSTDPLGRVTRYSYDAAGRVTR------RVDGAGASMSWVYDSAGRRVEEWVNDTLLAATTR---DFVG 1126
Cdd:COG3209   785 RYTYDALGRLTSVTYPDGETVTYTYDALGRLTSvitvgsGGGTDLQDRTYTYDAAGNITSITDALRAGTLTQTytyDALG 864
                         890       900       910       920       930       940
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 550741602 1127 RTTTIVEGDARHELRFDVRGNLLwRGRGDQGVRWEYDQNSRRTCMIRPNGQSTSYEYDANN 1187
Cdd:COG3209   865 RLTSATDPGTTESYTYDANGNLT-SRTDGGTTTYTYDALGRLVSVTKPDGTTTTYTYDALG 924
Hint cd00081
Hedgehog/Intein domain, found in Hedgehog proteins as well as proteins which contain inteins ...
1693-1828 2.41e-14

Hedgehog/Intein domain, found in Hedgehog proteins as well as proteins which contain inteins and undergo protein splicing (e.g. DnaB, RIR1-2, GyrA and Pol). In protein splicing an intervening polypeptide sequence - the intein - is excised from a protein, and the flanking polypeptide sequences - the exteins - are joined by a peptide bond. In addition to the autocatalytic splicing domain, many inteins contain an inserted endonuclease domain, which plays a role in spreading inteins. Hedgehog proteins are a major class of intercellular signaling molecules, which control inductive interactions during animal development. The mature signaling forms of hedgehog proteins are the N-terminal fragments, which are covalently linked to cholesterol at their C-termini. This modification is the result of an autoprocessing step catalyzed by the C-terminal fragments, which are aligned here.


Pssm-ID: 238035 [Multi-domain]  Cd Length: 136  Bit Score: 71.92  E-value: 2.41e-14
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1693 CFVAGTQVLMADGTSKNIEDVVA--GDMVAAYNpETGQAEPGEVtdTYIHDQVATWQVTTESGTVTTT----AVHPFYV- 1765
Cdd:cd00081     1 CFTGDTLVLLEDGGRKKIEELVEkkGDKVLALD-ETGKLVFSKV--LKVLRRDYEKKFYKIKTESGREitltPDHLLFVl 77
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 550741602 1766 --DGKGWVEAQDLRAGDQLYdeTGSLVTVVSIQATGQIATVYNVTVKDLHNYYVADdsswVLAHN 1828
Cdd:cd00081    78 edGELKWVFASDLKPGDYVL--VPVLEKVKEIEEIEYTGGVYDLTVEDNHNFIANG----VLVHN 136
Rhs_assc_core TIGR03696
RHS repeat-associated core domain; This model represents a conserved unique core sequence ...
1442-1513 2.74e-12

RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It occurs occasionally in the Archaea (Methanosarcina barkeri) but is common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.


Pssm-ID: 274730 [Multi-domain]  Cd Length: 77  Bit Score: 63.67  E-value: 2.74e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  1442 DPWAVATNVGVSAQGSLLVQGMEW--------MQARVYDPATRGFLSTDPLpGVAGaGWsgNPYAFAGNDPVNFSDPLGL 1513
Cdd:TIGR03696    2 DPYGEVLSESGAAPNPLRFTGQYYdaetglyyNGARYYDPELGRFLSPDPI-GLGG-GL--NLYAYVGNNPVNWVDPLGL 77
RhsA COG3209
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction ...
198-1095 4.32e-10

Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];


Pssm-ID: 442442 [Multi-domain]  Cd Length: 1103  Bit Score: 65.16  E-value: 4.32e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  198 VDAALAAFARSCSWVTFDASSVAAALTAWNKANVDEITWADRVRELFVNAGGTGMAVPNSALDAALQASGISSSRDDIVV 277
Cdd:COG3209   155 GVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGAVTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASV 234
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  278 DPPTAMGGQPTSGFADDPVNAATGNFVEPQVDLGFTGGSATLEWGRVYNSMSQVCGAFGPGWSSLAESRLVVTDESARWV 357
Cdd:COG3209   235 AATVTGSATGAAGAGAAVATAATTLGGTTGAGTGASGAGLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGT 314
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  358 QADGRHIVFGRMGEGWGRAQCENLWLEQPTGMGGVAFLIRDNDGGSFAFSQAGRPVFQDRGPGSRVAFTYSEDRLVRLEH 437
Cdd:COG3209   315 TTAAGTTGTAAVSGAADAGTTTTTGTGTGGTTTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGS 394
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  438 EFGRAVELVWSQDGRRVVEALASDGRRVGYVYDEQERLVESTGDTATHRYEWNEQGLMCRVVDADGVVEVDNTYDGQGRV 517
Cdd:COG3209   395 GGGSSTTGVGAGTTTTSTTGGDGGPATAAGALTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTG 474
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  518 VTQRSPHGRLSRYSYLGGRVTVVCDEDGARANTYVHDGKGRLVGVVDAEDHRQSASWDRFGNQVMVTDREGRTVVRSFDT 597
Cdd:COG3209   475 GGTEAGTGGGTLTSGSAGATTLGTDTTLDDTLGGTTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTV 554
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  598 RGHVIAEQTVSGARIEQDFDEQDRLIEVRVINDDQVSVTEMVYQGDNRNPARIVDPVGGVTVLEWDGSLLTKVVDPTGVT 677
Cdd:COG3209   555 GTGTSTGTGGTGTVTTTGDGTGGASTTTGTTGGTATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERAT 634
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  678 LTMGYDAHGDLVSTTNAAGHTARLERDEQGRVVAAITPLGHTTRYRYDERGMCVERIDPDGAVWRFEYSAGGRLLATVDP 757
Cdd:COG3209   635 ASTGSTTGGTTGTGVTTTGTTTTRATGTTGTGTGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGG 714
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  758 LGGRVEVERDEAGEESATIDELGRRVERRVDDLGNLTRVELPDGSawqFSYDAMSRMQSMKDASG-----GVWQFSYDAE 832
Cdd:COG3209   715 TTTRLGTTTTGGGGGTTTDGTGTGGTTGTLTTTSTTTTTTAGALT---YTYDALGRLTSETTPGGvtqgtYTTRYTYDAL 791
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  833 GTLKATTDATGGVRRYEtnqmalptaylddgkreeavYDRLGRMVCHINADATSTKTRYDLaglpveiideaggvtsier 912
Cdd:COG3209   792 GRLTSVTYPDGETVTYT--------------------YDALGRLTSVITVGSGGGTDLQDR------------------- 832
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  913 dlagrpvavtqpmgqtfRYEYDECGRWAATIST---GGDRYEMIYDADGRIEGEVWPTGERVTTtfdecgravarrepgr 989
Cdd:COG3209   833 -----------------TYTYDAAGNITSITDAlraGTLTQTYTYDALGRLTSATDPGTTESYT---------------- 879
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  990 gltrlkYDKLGRVVwsQDSWYGTRRFRYDAAGQMSEVVNAAGGVTHFAYDELGrigevtdpmggvtrhTYDPMGRLLTST 1069
Cdd:COG3209   880 ------YDANGNLT--SRTDGGTTTYTYDALGRLVSVTKPDGTTTTYTYDALG---------------HTDHLGSVRALT 936
                         890       900
                  ....*....|....*....|....*..
gi 550741602 1070 DPLGRVT-RYSYDAAGRVTRRVDGAGA 1095
Cdd:COG3209   937 DASGQVVwRYDYDPFGNLLAETSGAAA 963
Tox-REase-5 pfam15648
Restriction endonuclease fold toxin 5; A predicted toxin of the restriction endonuclease fold ...
1853-1925 7.02e-09

Restriction endonuclease fold toxin 5; A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, or PrsW-peptidase dependent secretion system. Versions of this domain are also found in caudoviruses.


Pssm-ID: 464785  Cd Length: 95  Bit Score: 54.73  E-value: 7.02e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602  1853 ARAYELQITGRPEGYYVN---------GVEFDGFQNGE--LLDAKGlGYANLI------PAEWSTAAKQLEDAADRQLEA 1915
Cdd:pfam15648    1 ARAYQARITGFPYGPEYRvkieewlwlGVDFDGFDPAEclLLEAKA-GYDQFFdpklpkPKKFFKGADKLLEQAERQNRA 79
                           90
                   ....*....|....
gi 550741602  1916 --AGSTP--IHWIF 1925
Cdd:pfam15648   80 arAGGPPvrLRWHF 93
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
1059-1095 1.41e-08

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 52.22  E-value: 1.41e-08
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 550741602  1059 YDPMGRLLTSTDPLGRVTRYSYDAAGRVTRRVDGAGA 1095
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
HintN smart00306
Hint (Hedgehog/Intein) domain N-terminal region; Hedgehog/Intein domain, N-terminal region. ...
1692-1780 9.04e-07

Hint (Hedgehog/Intein) domain N-terminal region; Hedgehog/Intein domain, N-terminal region. Domain has been split to accommodate large insertions of endonucleases.


Pssm-ID: 197642 [Multi-domain]  Cd Length: 100  Bit Score: 48.81  E-value: 9.04e-07
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602   1692 GCFVAGTQVLMADGTSKNIEDVVAGDMVAAYNPETGQAEPGEVTDTYIHDQVATWQVTTESGTVTTTAV--HPFYVDGKG 1769
Cdd:smart00306    1 GCFPGDTLVLTEDGGIKKIEELEEGDKVLALDEGTLKYSPVKVFLVREPKGEKKFYRIKTENGREITLTpdHLLLVRDGG 80
                            90
                    ....*....|....
gi 550741602   1770 ---WVEAQDLRAGD 1780
Cdd:smart00306   81 klvWVFASELKPGD 94
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
1038-1074 1.89e-06

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 46.05  E-value: 1.89e-06
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 550741602  1038 YDELGRIGEVTDPMGGVTRHTYDPMGRLLTSTDPLGR 1074
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
Hom_end_hint pfam05203
Hom_end-associated Hint; Homing endonucleases are encoded by mobile DNA elements that are ...
1693-1719 2.72e-06

Hom_end-associated Hint; Homing endonucleases are encoded by mobile DNA elements that are found inserted within host genes in all domains of life. The crystal structure of the homing nuclease PI-Sce revealed two domains: an endonucleolytic centre resembling the C-terminal domain of Drosophila melanogaster Hedgehog protein, and a second domain containing the protein-splicing active site. This domain corresponds to the latter protein-splicing domain.


Pssm-ID: 368334 [Multi-domain]  Cd Length: 444  Bit Score: 52.05  E-value: 2.72e-06
                           10        20
                   ....*....|....*....|....*..
gi 550741602  1693 CFVAGTQVLMADGTSKNIEDVVAGDMV 1719
Cdd:pfam05203    1 CFAKGTEVLMADGSIKSIEDIEVGDKV 27
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
1288-1323 4.88e-06

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 44.90  E-value: 4.88e-06
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 550741602  1288 YDANGRMV-VEDAAGTVSRFTYDAASQLVSVTNPDGT 1323
Cdd:pfam05593    1 YDAAGRLTsVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
1059-1099 5.49e-06

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 44.89  E-value: 5.49e-06
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|.
gi 550741602  1059 YDPMGRLLTSTDPLGRVTRYSYDAAGRVTRRVDGAGASMSW 1099
Cdd:TIGR01643    1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRY 41
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
1308-1343 1.11e-05

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 43.74  E-value: 1.11e-05
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 550741602  1308 YDAASQLVSVTNPDG-TTTYAYDDAGRRVREQGPAGE 1343
Cdd:pfam05593    1 YDAAGRLTSVTDPDGrVTTYTYDAAGRLTAVTDPDGT 37
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
682-722 4.11e-05

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 42.58  E-value: 4.11e-05
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|.
gi 550741602   682 YDAHGDLVSTTNAAGHTARLERDEQGRVVAAITPLGHTTRY 722
Cdd:TIGR01643    1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRY 41
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
704-739 4.49e-05

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 42.20  E-value: 4.49e-05
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 550741602   704 DEQGRVVAAITPLGHTTRYRYDERGMCVERIDPDGA 739
Cdd:pfam05593    2 DAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
704-744 6.65e-05

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 41.81  E-value: 6.65e-05
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|.
gi 550741602   704 DEQGRVVAAITPLGHTTRYRYDERGMCVERIDPDGAVWRFE 744
Cdd:TIGR01643    2 DAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRYE 42
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
1017-1056 7.63e-05

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 41.81  E-value: 7.63e-05
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|
gi 550741602  1017 YDAAGQMSEVVNAAGGVTHFAYDELGRIGEVTDPMGGVTR 1056
Cdd:TIGR01643    1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTR 40
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
724-759 8.58e-05

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 41.43  E-value: 8.58e-05
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 550741602   724 YDERGMCVERIDPDGAVWRFEYSAGGRLLATVDPLG 759
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
1017-1050 8.92e-05

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 41.43  E-value: 8.92e-05
                           10        20        30
                   ....*....|....*....|....*....|....
gi 550741602  1017 YDAAGQMSEVVNAAGGVTHFAYDELGRIGEVTDP 1050
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDP 34
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
682-717 9.84e-05

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 41.05  E-value: 9.84e-05
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 550741602   682 YDAHGDLVSTTNAAGHTARLERDEQGRVVAAITPLG 717
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
788-822 1.02e-04

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 41.05  E-value: 1.02e-04
                           10        20        30
                   ....*....|....*....|....*....|....*
gi 550741602   788 DDLGNLTRVELPDGSAWQFSYDAMSRMQSMKDASG 822
Cdd:pfam05593    2 DAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
1288-1326 1.64e-04

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 40.65  E-value: 1.64e-04
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|
gi 550741602  1288 YDANGRMV-VEDAAGTVSRFTYDAASQLVSVTNPDGTTTY 1326
Cdd:TIGR01643    1 YDAAGRLTgSTDADGTTTRYTYDAAGRLVEITDADGGSTR 40
Hop COG1372
Intein/homing endonuclease [Replication, recombination and repair, Mobilome: prophages, ...
1687-1780 2.21e-04

Intein/homing endonuclease [Replication, recombination and repair, Mobilome: prophages, transposons];


Pssm-ID: 440983 [Multi-domain]  Cd Length: 866  Bit Score: 46.43  E-value: 2.21e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1687 DVSRHGCFVAGTQVLMADGTSKNIEDVV---AGDMVAAYNPETGQAEPGEVT---DTYIHDQV-----------ATwqvt 1749
Cdd:COG1372    92 DTGTGVCLTGDTLVLTADGRLVPIGELVgsgEDVEVLSLDLDTGKLVWAPVTkvfKTGVKPVYrirtrsgreirAT---- 167
                          90       100       110
                  ....*....|....*....|....*....|.
gi 550741602 1750 tesgtvtttAVHPFYVDgKGWVEAQDLRAGD 1780
Cdd:COG1372   168 ---------PDHPFLTL-SGWKEAGELKPGD 188
Bacuni_01323_like cd12871
Uncharacterized protein conserved in Bacteroidetes; A well-conserved family of 16-stranded ...
1174-1330 5.21e-04

Uncharacterized protein conserved in Bacteroidetes; A well-conserved family of 16-stranded beta barrels resembling outer membrane porins. The interior of the barrels is mostly occupied by an insert with partially helical structure.


Pssm-ID: 214015 [Multi-domain]  Cd Length: 231  Bit Score: 43.56  E-value: 5.21e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1174 PNGQSTSYEYDANNRVSAFIQeglgrvvIDRDSLGRIVSVfadglyASWEYADGAVVrqrVERNGFISESVITRDENGRV 1253
Cdd:cd12871    14 KSTSEYTFEYDADGRLTSITT-------TQEGEAEEITYT------TTITYEPNVIT---VTDDGGKTVSTYTLNEKGYV 77
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602 1254 VA------DKTDGITTFYSYDQTGQLVRAQTSEGL-VTTWEYD-ANGRMV----VEDAAGTVSRFTYDAASQLVSVTNP- 1320
Cdd:cd12871    78 TScteteyGKGQLRTYTFTYNADGQLTKIVESIGTeYSTITITwNNGDIVsistKSNTEENESKITYTSDKVYNPIVNKg 157
                         170
                  ....*....|....*
gi 550741602 1321 -----DGTTTYAYDD 1330
Cdd:cd12871   158 clmlfGLTLGYDLSD 172
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
1308-1345 5.29e-04

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 39.11  E-value: 5.29e-04
                           10        20        30
                   ....*....|....*....|....*....|....*....
gi 550741602  1308 YDAASQLVSVTNPDGTTT-YAYDDAGRRVREQGPAGERR 1345
Cdd:TIGR01643    1 YDAAGRLTGSTDADGTTTrYTYDAAGRLVEITDADGGST 39
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
1038-1078 7.84e-04

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 38.73  E-value: 7.84e-04
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|.
gi 550741602  1038 YDELGRIGEVTDPMGGVTRHTYDPMGRLLTSTDPLGRVTRY 1078
Cdd:TIGR01643    1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRY 41
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
788-827 1.10e-03

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 38.34  E-value: 1.10e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|
gi 550741602   788 DDLGNLTRVELPDGSAWQFSYDAMSRMQSMKDASGGVWQF 827
Cdd:TIGR01643    2 DAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRY 41
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
808-849 1.44e-03

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 37.95  E-value: 1.44e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|..
gi 550741602   808 YDAMSRMQSMKDASGGVWQFSYDAEGTLKATTDATGGVRRYE 849
Cdd:TIGR01643    1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRYE 42
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
913-948 2.43e-03

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 37.19  E-value: 2.43e-03
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 550741602   913 DLAGRPVAVTQPMGQTFRYEYDECGRWAATISTGGD 948
Cdd:pfam05593    2 DAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
AXH smart00536
Domain in ataxins and HMG-containing proteins; unknown function
1693-1770 2.64e-03

Domain in ataxins and HMG-containing proteins; unknown function


Pssm-ID: 197779  Cd Length: 116  Bit Score: 39.39  E-value: 2.64e-03
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 550741602   1693 CFVAGTQVLMADGTSKNIEDVVAGDMVAAY---NPETGQAE----------PGEVTDTY---IHDQVATWQVTTEsgtvt 1756
Cdd:smart00536    1 CFMKGTRLCLANGSNKKVEDLKTEDFIRSAgcsNDEDLQMStvkrigssglPSVVTLTFdpgVEDALLTVECQVE----- 75
                            90
                    ....*....|....
gi 550741602   1757 ttavHPFYVDGKGW 1770
Cdd:smart00536   76 ----HPFFVKGKGW 85
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
724-765 3.37e-03

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 37.18  E-value: 3.37e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|..
gi 550741602   724 YDERGMCVERIDPDGAVWRFEYSAGGRLLATVDPLGGRVEVE 765
Cdd:TIGR01643    1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRYE 42
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
1267-1302 3.56e-03

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 36.81  E-value: 3.56e-03
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 550741602  1267 YDQTGQLVRAQTSEGLVTTWEYDANGRMVVE-DAAGT 1302
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVtDPDGT 37
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
1080-1110 5.89e-03

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 36.04  E-value: 5.89e-03
                           10        20        30
                   ....*....|....*....|....*....|.
gi 550741602  1080 YDAAGRVTRRVDGAGASMSWVYDSAGRRVEE 1110
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAV 31
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
1080-1110 9.39e-03

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 35.64  E-value: 9.39e-03
                           10        20        30
                   ....*....|....*....|....*....|.
gi 550741602  1080 YDAAGRVTRRVDGAGASMSWVYDSAGRRVEE 1110
Cdd:TIGR01643    1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEI 31
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
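
The E-value threshold above can be checked against the per-hit summary lines in this listing. The following is a minimal sketch, not part of CD-Search itself: the function names `parse_summary` and `passes_threshold` are hypothetical, the regular expression assumes the "Pssm-ID ... Cd Length ... Bit Score ... E-value" layout shown on this page, and treating the threshold as inclusive (`<=`) is an assumption.

```python
import re

# Matches the per-hit summary line as it appears in this listing, e.g.
# "Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 52.22  E-value: 1.41e-08"
# The "[Multi-domain]" tag is optional, so it is skipped with a lazy wildcard.
SUMMARY = re.compile(
    r"Pssm-ID:\s*(?P<pssm>\d+).*?"
    r"Cd Length:\s*(?P<length>\d+)\s+"
    r"Bit Score:\s*(?P<bits>[\d.]+)\s+"
    r"E-value:\s*(?P<evalue>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the numeric fields of a summary line, or None if it does not match."""
    m = SUMMARY.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm"]),
        "cd_length": int(m["length"]),
        "bit_score": float(m["bits"]),
        "evalue": float(m["evalue"]),
    }


def passes_threshold(hit, threshold=0.01):
    """Assumption: a hit is reported when its E-value is at or below the threshold."""
    return hit["evalue"] <= threshold


if __name__ == "__main__":
    line = "Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 35.64  E-value: 9.39e-03"
    hit = parse_summary(line)
    print(hit, passes_threshold(hit))
```

Applied to the weakest hit in this listing (E-value 9.39e-03), the check still passes, which is consistent with every listed hit falling under the 0.01 preset.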
