| Name | Accession | Description | Interval | E-value |
| Ten_N | pfam06484 | Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of ... | 11-307 | 1.16e-170 |
|
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are 'pair-rule' genes and are involved in tissue patterning, probably neural patterning specifically. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain and translocates to the nucleus, where it probably carries out some transcriptional regulatory activity. The length of this region and its conservation suggest that there may be two structural domains here (personal obs: C Yeats).
Pssm-ID: 461932 [Multi-domain] Cd Length: 367 Bit Score: 529.93 E-value: 1.16e-170
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 11 SLTKSRREKERRYTNSSADNEECRVPTQKSYSSSETLKAFDHDSsRLLYGNRVKDLVHREADEYPRQGQNFTLRQLGVCE 90
Cdd:pfam06484 1 SLTKRRRDKERRYTSSSADSEECRVPTQKSYSSSETLKAFDHDS-RMLYGNRVKDMVHKEADEFSRQGQNFSLRELGICE 79
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 91 SATRRGVAFCAEMGLPHRGYSISAGSDADTENEAVMSPEHAMRLWGRGVKSGRSSCLSSRSNSALTLTDTEHENRSDSE- 169
Cdd:pfam06484 80 PSPRHGLAYCTEMGLPHRGYSISTGSDADTETDGPMSPEHAVRLWGRGTKSGRSSCLSSRSNSALTLTDTEHENKSDNEn 159
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 170 ------------------------------------------------------------------SEQPANNQGQPTLQ 183
Cdd:pfam06484 160 gppippsssssspveqhsppppslnenqrpllgnnashpildsdpdeefspnsylvrtgsgpqsapSEQPPNFQNHSRLR 239
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 184 ----PLPPSHKQHPAHHPSVTSLNRNSLTNRRNQSPAPPAALPAELQTTPESVQLQDSWVLGSNVPLESRHFLFKTGTGT 259
Cdd:pfam06484 240 tpppPLPPPHKQNQHHHPSINSLNRSSLTNRRNPSPAPTASLPAELQSTQESVQLQDSWVLNSNVPLETRHFLFKTGTGT 319
|
330 340 350 360
....*....|....*....|....*....|....*....|....*...
gi 1958683751 260 TPLFSTATPGYTMASGSVYSPPTRPLPRNTLSRSAFKFKKSSRYCSWR 307
Cdd:pfam06484 320 TPLFCTASPGYPLTSGTVYSPPPRPLPRNTFSRPAFKLKKPYKYCSWK 367
|
|
| NHL super family | cl18310 | NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... | 1160-1519 | 6.16e-41 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats. The actual alignment was detected with superfamily member cd14953:
Pssm-ID: 302697 [Multi-domain] Cd Length: 323 Bit Score: 155.00 E-value: 6.16e-41
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1160 VSSIMGNGRRrsiscpscnGQADGNKLLA----PVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVLELRNKDFRHSSN 1233
Cdd:cd14953 1 VSTVAGSGTA---------GFSGGGGTAArfnsPSGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAGTGTAGFADGGG 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1234 PAHRYY----LATDPvTGDLYVSDTNTRRIYRpksLTGAKDLTknaeVVAGTGEqclpfdeARCGDGGKAVEATLMSPKG 1309
Cdd:cd14953 72 AAAQFNtpsgVAVDA-AGNLYVADTGNHRIRK---ITPDGVVS----TLAGTGT-------AGFSDDGGATAAQFNYPTG 136
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1310 MAIDKNGLIYFVDGT--MIRKVDQNGIISTLLGSNDLTSAR--PLTcdtsmhisQVRLEWPTDLAINPMDNsIYVLD--N 1383
Cdd:cd14953 137 VAVDAAGNLYVADTGnhRIRKITPDGVVTTVAGTGGAGYAGdgPAT--------AAQFNNPTGVAVDAAGN-LYVADrgN 207
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1384 NVVLQITENRQVRIAAGRPmhcqvpGVEYPVGKHAVQTTLESATAIAVSYSGVLYITETDEkkiNRIRQVTTDGEISLVA 1463
Cdd:cd14953 208 HRIRKITPDGVVTTVAGTG------TAGFSGDGGATAAQLNNPTGVAVDAAGNLYVADSGN---HRIRKITPAGVVTTVA 278
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|....*..
gi 1958683751 1464 GIPSEcdckndancdcyQTGD-GYAKDAKLSAPSSLAASPDGTLYIADLGNIRIRAV 1519
Cdd:cd14953 279 GGGAG------------FSGDgGPATSAQFNNPTGVAVDAAGNLYVADTGNNRIRKI 323
|
|
| Tox-GHH | pfam15636 | GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH ... | 2636-2713 | 7.78e-40 |
|
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive version of this domain at their C-terminus.
Pssm-ID: 464783 Cd Length: 78 Bit Score: 142.75 E-value: 7.78e-40
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1958683751 2636 EEKARILEQARQRALARAWAREQQRVRDGEEGARLWTEGEKRQLLSAGKVQGYDGYYVLSVEQYPELADSANNIQFLR 2713
Cdd:pfam15636 1 EERKRLLEHAKKRAVREAWHRERQLLRNGLPGSRDWTDEEKEELLSTGSVPGYDGEYIHPVEQYPELADDPSNIRFRK 78
|
|
| RhsA | COG3209 | Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction ... | 1473-2413 | 1.05e-33 |
|
Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction only]; :
Pssm-ID: 442442 [Multi-domain] Cd Length: 1103 Bit Score: 142.97 E-value: 1.05e-33
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1473 NDANCDCYQTGDGYAKDAKLSAPSSLAASPDGTLYIADLGNIRIRAVSKNKPLLNSMNFYEVASPTDQELYIFDINGTHQ 1552
Cdd:COG3209 109 AAATASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGA 188
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1553 YTVSLVTGDYLYNFSYSNDNDVTAVTDSNGNTLRIRRDPNRMPVRVVSPDNQVIWLTIGTNGCLKSMTAQGLELVLFTYH 1632
Cdd:COG3209 189 VTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTLGGTTGAGTG 268
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1633 GNSGLLATK---SDETGWTTFFDYDSEGRLTNVTFPTGVVTNLHGDMDKAITVDIESSSREEDVSITSNLSSIDSFYTMV 1709
Cdd:COG3209 269 ASGAGLDAStgtGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTGGTTTT 348
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1710 QDQLRNSYQIGYDGSLRIFYASGLDSHYQTEPHVLAGTANPTVAKRNMTLPGENGQNLVEWRFRKEQAQGKVNVFGRKLR 1789
Cdd:COG3209 349 VGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAAGALTA 428
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1790 VNGRNLLSVDFDRTTKTEKIYDDHRKFLLRIAYDTSGHPTLWLPSSKLMAVNVTYSSTGQIASIQRGTTSEKVDYDSQGR 1869
Cdd:COG3209 429 GGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLDDTLGG 508
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1870 IVSRVFADGKTWSYTYLEKSMVLLLHSQRQYIFEYDMWDRLSAITMPSVARHTMQTIRSIGYYRNIYNPPESNASIITDY 1949
Cdd:COG3209 509 TTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGASTTTGTTGGT 588
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1950 NEEGLLLQTAFLGTSRRVLFKYRRQTRLSEILYDSTRVSFTYDETAGVLKTVNLQSDGFicTIRYRQIGPLIDRQIFRFS 2029
Cdd:COG3209 589 ATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGT--GVTTTGTTTTRATGTTGTG 666
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 2030 EDGMVNARFDYSYDNSFRVTSMQGVINETPLPIDLYQFDDISGKVEQFGKFGVIYYDINQIISTAVMTYTKHFDAHGRIK 2109
Cdd:COG3209 667 TGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGGTTGTLTT 746
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 2110 EIQYEifRSLMYWITIQYDNMGRVTKREIKIGPFANTTKYAYEYDVDGQLQTVYLNEKIMWRYNYDLNGNLHLLNPSSSA 2189
Cdd:COG3209 747 TSTTT--TTTAGALTYTYDALGRLTSETTPGGVTQGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALGRLTSVITVGSG 824
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 2190 RLTPL-----RYDLRDRITRLgdvqyrldEDGFLRQRGTEIFEYSSKGLLTRVYSKGSGWTviYRYDGLGRRVSSKTSLG 2264
Cdd:COG3209 825 GGTDLqdrtyTYDAAGNITSI--------TDALRAGTLTQTYTYDALGRLTSATDPGTTES--YTYDANGNLTSRTDGGT 894
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 2265 QHLQFFYADLtyPTRITHvynhSSSEITSLYYDLQGHlfameissgdefyiaSDNTGTPLAVFSSNGLMLKQIQYTAYGE 2344
Cdd:COG3209 895 TTYTYDALGR--LVSVTK----PDGTTTTYTYDALGH---------------TDHLGSVRALTDASGQVVWRYDYDPFGN 953
|
890 900 910 920 930 940
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1958683751 2345 IYFDSNIDFQLVIGFHGGLYDPLTKLIHFGERDYDILAGRWTTPDieiwkRIGKDPAPfNLYMFRNNNP 2413
Cdd:COG3209 954 LLAETSGAAANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPD-----PIGLAGGL-NLYAYVGNNP 1016
|
|
| DUF5885 super family | cl44670 | Family of unknown function (DUF5885); This is a family of uncharacterized proteins of unknown ... | 521-674 | 1.64e-09 |
|
Family of unknown function (DUF5885); This is a family of uncharacterized proteins of unknown function found in viruses. The actual alignment was detected with superfamily member pfam19232:
Pssm-ID: 437064 Cd Length: 265 Bit Score: 61.18 E-value: 1.64e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 521 CHGNGECvsGTCH-CFPGFLGPDCSRACPVLCsGNGQ----------YSKGRC----LCFSGwkgTECDVPTTQCI-DPQ 584
Cdd:pfam19232 34 CTTDAQC--GTCMtCVAGACTPKASCCGGVTC-GAGQtcdaktntcvYVKGYCsadhPCPSG---SACDTAKNACIaQPP 107
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 585 CG---GRGiCIMG-------------------SCACNSGYK-GENCE--------EADCLDP---------------GCS 618
Cdd:pfam19232 108 YGpdsGKG-CVRGfgawiweldpatnsgvwrcRCANGSLYNsAHECSpladqtlcAAENLDPnalvpassvpafaayGWG 186
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 619 NHGVCIH-------------GECHCNPGWGGGNCEILKTmcpdqCSGHGTYLQESGSCTCD------------PN---WT 670
Cdd:pfam19232 187 NQPVLINkstagaavpsplaGVCPCKPGWAGGSCTEDRT-----CNGRGTWNETTGQCACNidfsghnscgddNNctsWT 261
|
....
gi 1958683751 671 GPDC 674
Cdd:pfam19232 262 GPRC 265
|
|
| acid_disulf_rpt | NF033662 | acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with ... | 790-820 | 6.35e-08 |
|
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids. :
Pssm-ID: 411265 [Multi-domain] Cd Length: 32 Bit Score: 50.59 E-value: 6.35e-08
10 20 30
....*....|....*....|....*....|.
gi 1958683751 790 AMETLCTDSKDNEGDGLVDCMDPDCCLQSSC 820
Cdd:NF033662 2 ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC 32
|
|
| DSL super family | cl19567 | Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain ... | 696-739 | 1.60e-05 |
|
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain defined by structure. The actual alignment was detected with superfamily member pfam01414:
Pssm-ID: 473190 Cd Length: 46 Bit Score: 44.15 E-value: 1.60e-05
10 20 30 40
....*....|....*....|....*....|....*....|....*..
gi 1958683751 696 CEEGWTGPACNqRACHPRCAE--HGTC-KDGKCECSQGWNGEHCTIA 739
Cdd:pfam01414 1 CDENYYGSTCS-KFCRPRDDKfgHYTCdANGNKVCLPGWTGPYCDKP 46
|
|
| C_rich_MXAN6577 super family | cl49352 | MXAN_6577-like cysteine-rich domain; | 679-781 | 1.43e-04 |
|
MXAN_6577-like cysteine-rich domain; The actual alignment was detected with superfamily member NF041328:
Pssm-ID: 469225 [Multi-domain] Cd Length: 145 Bit Score: 44.36 E-value: 1.43e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 679 CSVDCGSHGVCMGGTCRCEEGWT--GPAC--------NQRACHPRCAEHGTCKDGKCecsqgwngehctiahyldkivka 748
Cdd:NF041328 45 CGVACGAGQTCVAGACGCGPGTVacGGACvdtasdpaHCGACGAACAPGQVCEGGAC----------------------- 101
|
90 100 110 120
....*....|....*....|....*....|....*....|
gi 1958683751 749 dkigyKEGCP-GLCNSNGRCT-LDQNGWHC-----VCQPG 781
Cdd:NF041328 102 -----REACSeGLTRCGGACVdLATDPLHCgacgvACDPG 136
|
|
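Each hit in this listing carries a one-line summary of the form `Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...`, with an optional `[Multi-domain]` flag. As a minimal sketch for pulling those fields out of the plain text, the following Python uses a regular expression written against the lines shown here. The names `SUMMARY_RE` and `parse_summary` are hypothetical helpers for illustration and are not part of any NCBI tool; the pattern is an assumption based only on the summary lines in this listing.

```python
import re

# Pattern assumed from the summary lines in this listing, e.g.
# "Pssm-ID: 461932 [Multi-domain] Cd Length: 367 Bit Score: 529.93 E-value: 1.16e-170"
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"          # optional "[Multi-domain]" flag
    r"\s*Cd Length:\s*(?P<cd_length>\d+)"
    r"\s*Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s*E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["flag"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


if __name__ == "__main__":
    line = ("Pssm-ID: 461932 [Multi-domain] Cd Length: 367 "
            "Bit Score: 529.93 E-value: 1.16e-170")
    print(parse_summary(line))
```

Parsed records can then be sorted by `e_value` or filtered on `multi_domain`; lines that are not summary lines simply return `None`.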
|
| Name | Accession | Description | Interval | E-value |
| Ten_N | pfam06484 | Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of ... | 11-307 | 1.16e-170 |
|
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are 'pair-rule' genes and are involved in tissue patterning, probably neural patterning specifically. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain and translocates to the nucleus, where it probably carries out some transcriptional regulatory activity. The length of this region and its conservation suggest that there may be two structural domains here (personal obs: C Yeats).
Pssm-ID: 461932 [Multi-domain] Cd Length: 367 Bit Score: 529.93 E-value: 1.16e-170
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 11 SLTKSRREKERRYTNSSADNEECRVPTQKSYSSSETLKAFDHDSsRLLYGNRVKDLVHREADEYPRQGQNFTLRQLGVCE 90
Cdd:pfam06484 1 SLTKRRRDKERRYTSSSADSEECRVPTQKSYSSSETLKAFDHDS-RMLYGNRVKDMVHKEADEFSRQGQNFSLRELGICE 79
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 91 SATRRGVAFCAEMGLPHRGYSISAGSDADTENEAVMSPEHAMRLWGRGVKSGRSSCLSSRSNSALTLTDTEHENRSDSE- 169
Cdd:pfam06484 80 PSPRHGLAYCTEMGLPHRGYSISTGSDADTETDGPMSPEHAVRLWGRGTKSGRSSCLSSRSNSALTLTDTEHENKSDNEn 159
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 170 ------------------------------------------------------------------SEQPANNQGQPTLQ 183
Cdd:pfam06484 160 gppippsssssspveqhsppppslnenqrpllgnnashpildsdpdeefspnsylvrtgsgpqsapSEQPPNFQNHSRLR 239
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 184 ----PLPPSHKQHPAHHPSVTSLNRNSLTNRRNQSPAPPAALPAELQTTPESVQLQDSWVLGSNVPLESRHFLFKTGTGT 259
Cdd:pfam06484 240 tpppPLPPPHKQNQHHHPSINSLNRSSLTNRRNPSPAPTASLPAELQSTQESVQLQDSWVLNSNVPLETRHFLFKTGTGT 319
|
330 340 350 360
....*....|....*....|....*....|....*....|....*...
gi 1958683751 260 TPLFSTATPGYTMASGSVYSPPTRPLPRNTLSRSAFKFKKSSRYCSWR 307
Cdd:pfam06484 320 TPLFCTASPGYPLTSGTVYSPPPRPLPRNTFSRPAFKLKKPYKYCSWK 367
|
|
| NHL_like_1 | cd14953 | Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... | 1160-1519 | 6.16e-41 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 155.00 E-value: 6.16e-41
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1160 VSSIMGNGRRrsiscpscnGQADGNKLLA----PVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVLELRNKDFRHSSN 1233
Cdd:cd14953 1 VSTVAGSGTA---------GFSGGGGTAArfnsPSGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAGTGTAGFADGGG 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1234 PAHRYY----LATDPvTGDLYVSDTNTRRIYRpksLTGAKDLTknaeVVAGTGEqclpfdeARCGDGGKAVEATLMSPKG 1309
Cdd:cd14953 72 AAAQFNtpsgVAVDA-AGNLYVADTGNHRIRK---ITPDGVVS----TLAGTGT-------AGFSDDGGATAAQFNYPTG 136
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1310 MAIDKNGLIYFVDGT--MIRKVDQNGIISTLLGSNDLTSAR--PLTcdtsmhisQVRLEWPTDLAINPMDNsIYVLD--N 1383
Cdd:cd14953 137 VAVDAAGNLYVADTGnhRIRKITPDGVVTTVAGTGGAGYAGdgPAT--------AAQFNNPTGVAVDAAGN-LYVADrgN 207
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1384 NVVLQITENRQVRIAAGRPmhcqvpGVEYPVGKHAVQTTLESATAIAVSYSGVLYITETDEkkiNRIRQVTTDGEISLVA 1463
Cdd:cd14953 208 HRIRKITPDGVVTTVAGTG------TAGFSGDGGATAAQLNNPTGVAVDAAGNLYVADSGN---HRIRKITPAGVVTTVA 278
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|....*..
gi 1958683751 1464 GIPSEcdckndancdcyQTGD-GYAKDAKLSAPSSLAASPDGTLYIADLGNIRIRAV 1519
Cdd:cd14953 279 GGGAG------------FSGDgGPATSAQFNNPTGVAVDAAGNLYVADTGNNRIRKI 323
|
|
| Tox-GHH | pfam15636 | GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH ... | 2636-2713 | 7.78e-40 |
|
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive version of this domain at their C-terminus.
Pssm-ID: 464783 Cd Length: 78 Bit Score: 142.75 E-value: 7.78e-40
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1958683751 2636 EEKARILEQARQRALARAWAREQQRVRDGEEGARLWTEGEKRQLLSAGKVQGYDGYYVLSVEQYPELADSANNIQFLR 2713
Cdd:pfam15636 1 EERKRLLEHAKKRAVREAWHRERQLLRNGLPGSRDWTDEEKEELLSTGSVPGYDGEYIHPVEQYPELADDPSNIRFRK 78
|
|
| RhsA | COG3209 | Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction ... | 1473-2413 | 1.05e-33 |
|
Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction only];
Pssm-ID: 442442 [Multi-domain] Cd Length: 1103 Bit Score: 142.97 E-value: 1.05e-33
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1473 NDANCDCYQTGDGYAKDAKLSAPSSLAASPDGTLYIADLGNIRIRAVSKNKPLLNSMNFYEVASPTDQELYIFDINGTHQ 1552
Cdd:COG3209 109 AAATASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGA 188
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1553 YTVSLVTGDYLYNFSYSNDNDVTAVTDSNGNTLRIRRDPNRMPVRVVSPDNQVIWLTIGTNGCLKSMTAQGLELVLFTYH 1632
Cdd:COG3209 189 VTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTLGGTTGAGTG 268
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1633 GNSGLLATK---SDETGWTTFFDYDSEGRLTNVTFPTGVVTNLHGDMDKAITVDIESSSREEDVSITSNLSSIDSFYTMV 1709
Cdd:COG3209 269 ASGAGLDAStgtGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTGGTTTT 348
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1710 QDQLRNSYQIGYDGSLRIFYASGLDSHYQTEPHVLAGTANPTVAKRNMTLPGENGQNLVEWRFRKEQAQGKVNVFGRKLR 1789
Cdd:COG3209 349 VGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAAGALTA 428
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1790 VNGRNLLSVDFDRTTKTEKIYDDHRKFLLRIAYDTSGHPTLWLPSSKLMAVNVTYSSTGQIASIQRGTTSEKVDYDSQGR 1869
Cdd:COG3209 429 GGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLDDTLGG 508
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1870 IVSRVFADGKTWSYTYLEKSMVLLLHSQRQYIFEYDMWDRLSAITMPSVARHTMQTIRSIGYYRNIYNPPESNASIITDY 1949
Cdd:COG3209 509 TTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGASTTTGTTGGT 588
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1950 NEEGLLLQTAFLGTSRRVLFKYRRQTRLSEILYDSTRVSFTYDETAGVLKTVNLQSDGFicTIRYRQIGPLIDRQIFRFS 2029
Cdd:COG3209 589 ATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGT--GVTTTGTTTTRATGTTGTG 666
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 2030 EDGMVNARFDYSYDNSFRVTSMQGVINETPLPIDLYQFDDISGKVEQFGKFGVIYYDINQIISTAVMTYTKHFDAHGRIK 2109
Cdd:COG3209 667 TGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGGTTGTLTT 746
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 2110 EIQYEifRSLMYWITIQYDNMGRVTKREIKIGPFANTTKYAYEYDVDGQLQTVYLNEKIMWRYNYDLNGNLHLLNPSSSA 2189
Cdd:COG3209 747 TSTTT--TTTAGALTYTYDALGRLTSETTPGGVTQGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALGRLTSVITVGSG 824
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 2190 RLTPL-----RYDLRDRITRLgdvqyrldEDGFLRQRGTEIFEYSSKGLLTRVYSKGSGWTviYRYDGLGRRVSSKTSLG 2264
Cdd:COG3209 825 GGTDLqdrtyTYDAAGNITSI--------TDALRAGTLTQTYTYDALGRLTSATDPGTTES--YTYDANGNLTSRTDGGT 894
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 2265 QHLQFFYADLtyPTRITHvynhSSSEITSLYYDLQGHlfameissgdefyiaSDNTGTPLAVFSSNGLMLKQIQYTAYGE 2344
Cdd:COG3209 895 TTYTYDALGR--LVSVTK----PDGTTTTYTYDALGH---------------TDHLGSVRALTDASGQVVWRYDYDPFGN 953
|
890 900 910 920 930 940
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1958683751 2345 IYFDSNIDFQLVIGFHGGLYDPLTKLIHFGERDYDILAGRWTTPDieiwkRIGKDPAPfNLYMFRNNNP 2413
Cdd:COG3209 954 LLAETSGAAANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPD-----PIGLAGGL-NLYAYVGNNP 1016
|
|
| DUF5885 | pfam19232 | Family of unknown function (DUF5885); This is a family of uncharacterized proteins of unknown ... | 521-674 | 1.64e-09 |
|
Family of unknown function (DUF5885); This is a family of uncharacterized proteins of unknown function found in viruses.
Pssm-ID: 437064 Cd Length: 265 Bit Score: 61.18 E-value: 1.64e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 521 CHGNGECvsGTCH-CFPGFLGPDCSRACPVLCsGNGQ----------YSKGRC----LCFSGwkgTECDVPTTQCI-DPQ 584
Cdd:pfam19232 34 CTTDAQC--GTCMtCVAGACTPKASCCGGVTC-GAGQtcdaktntcvYVKGYCsadhPCPSG---SACDTAKNACIaQPP 107
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 585 CG---GRGiCIMG-------------------SCACNSGYK-GENCE--------EADCLDP---------------GCS 618
Cdd:pfam19232 108 YGpdsGKG-CVRGfgawiweldpatnsgvwrcRCANGSLYNsAHECSpladqtlcAAENLDPnalvpassvpafaayGWG 186
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 619 NHGVCIH-------------GECHCNPGWGGGNCEILKTmcpdqCSGHGTYLQESGSCTCD------------PN---WT 670
Cdd:pfam19232 187 NQPVLINkstagaavpsplaGVCPCKPGWAGGSCTEDRT-----CNGRGTWNETTGQCACNidfsghnscgddNNctsWT 261
|
....
gi 1958683751 671 GPDC 674
Cdd:pfam19232 262 GPRC 265
|
|
| Vgb | COG4257 | Streptogramin lyase [Defense mechanisms]; | 1189-1467 | 3.08e-09 |
|
Streptogramin lyase [Defense mechanisms];
Pssm-ID: 443399 [Multi-domain] Cd Length: 270 Bit Score: 60.42 E-value: 3.08e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1189 PVALACGIDGSLYVGDF--NYVRRIFP-SGNVTsvlelrnkdfRHSSNPAHRYY-LATDPvTGDLYVSDTNTRRIYRpks 1264
Cdd:COG4257 19 PRDVAVDPDGAVWFTDQggGRIGRLDPaTGEFT----------EYPLGGGSGPHgIAVDP-DGNLWFTDNGNNRIGR--- 84
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1265 LTGAkdlTKNAEVVAGTGEQCLPFdearcgdggkaveatlmspkGMAIDKNGLIYFVDGT--MIRKVD-QNGIISTLLGS 1341
Cdd:COG4257 85 IDPK---TGEITTFALPGGGSNPH--------------------GIAFDPDGNLWFTDQGgnRIGRLDpATGEVTEFPLP 141
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1342 NDLTSARPLTCD--------------------TSMHISQVRLE----WPTDLAINPmDNSIYVLD--NNVVLQITEnrqv 1395
Cdd:COG4257 142 TGGAGPYGIAVDpdgnlwvtdfganaigridpDTGTLTEYALPtpgaGPRGLAVDP-DGNLWVADtgSGRIGRFDP---- 216
|
250 260 270 280 290 300 310
....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1958683751 1396 riAAGRpmhcqvpgveypVGKHAVQTTLESATAIAVSYSGVLYITETDekkINRIRQVTTDGEISLVAgIPS 1467
Cdd:COG4257 217 --KTGT------------VTEYPLPGGGARPYGVAVDGDGRVWFAESG---ANRIVRFDPDTELTEYV-LPS 270
|
|
| Rhs_assc_core | TIGR03696 | RHS repeat-associated core domain; This model represents a conserved unique core sequence ... | 2339-2413 | 4.17e-09 |
|
RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It occurs occasionally in the Archaea (Methanosarcina barkeri) but is common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.
Pssm-ID: 274730 [Multi-domain] Cd Length: 77 Bit Score: 55.20 E-value: 4.17e-09
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1958683751 2339 YTAYGEIYFDSNIDFQLvIGFHGGLYDPLTKLIHFGERDYDILAGRWTTPDieiwkRIGKDpAPFNLYMFRNNNP 2413
Cdd:TIGR03696 1 YDPYGEVLSESGAAPNP-LRFTGQYYDAETGLYYNGARYYDPELGRFLSPD-----PIGLG-GGLNLYAYVGNNP 68
|
|
| acid_disulf_rpt | NF033662 | acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with ... | 790-820 | 6.35e-08 |
|
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids.
Pssm-ID: 411265 [Multi-domain] Cd Length: 32 Bit Score: 50.59 E-value: 6.35e-08
10 20 30
....*....|....*....|....*....|.
gi 1958683751 790 AMETLCTDSKDNEGDGLVDCMDPDCCLQSSC 820
Cdd:NF033662 2 ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC 32
|
|
| C_rich_MXAN6577 | NF041328 | MXAN_6577-like cysteine-rich domain; | 571-725 | 5.20e-07 |
|
MXAN_6577-like cysteine-rich domain;
Pssm-ID: 469225 [Multi-domain] Cd Length: 145 Bit Score: 51.30 E-value: 5.20e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 571 TECDVPTTQCIDPQ--CGGRGICIMGScACNSGykgeNCEEAdcldpgCSNHGVCIHGECHCNPGwgggnceilKTMCPD 648
Cdd:NF041328 12 AGCPEPGAVCPEGLsvCGGACVDLRSD-PSNCG----ACGVA------CGAGQTCVAGACGCGPG---------TVACGG 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 649 QCSGHGTylqesgsctcDPNWTGPdcsneiCSVDCGSHGVCMGGTCR--CEEGWT--GPAC--------NQRACHPRCAE 716
Cdd:NF041328 72 ACVDTAS----------DPAHCGA------CGAACAPGQVCEGGACReaCSEGLTrcGGACvdlatdplHCGACGVACDP 135
|
....*....
gi 1958683751 717 HGTCKDGKC 725
Cdd:NF041328 136 GESCRGGAC 144
|
|
| PLN02919 | PLN02919 | haloacid dehalogenase-like hydrolase family protein | 1240-1523 | 2.51e-06 |
|
haloacid dehalogenase-like hydrolase family protein
Pssm-ID: 215497 [Multi-domain] Cd Length: 1057 Bit Score: 53.32 E-value: 2.51e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1240 LATDPVTGDLYVSDTNTRRIYrpksltgAKDLTKNAEV-VAGTGEQCL---PFDEArcgdggkaveaTLMSPKGMAID-K 1314
Cdd:PLN02919 573 LAIDLLNNRLFISDSNHNRIV-------VTDLDGNFIVqIGSTGEEGLrdgSFEDA-----------TFNRPQGLAYNaK 634
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1315 NGLIYFVD--GTMIRKVD-QNGIISTLLGS----NDLTSARPLTcdtsmhiSQVrLEWPTDLAINPMDNSIYVLDNNVvL 1387
Cdd:PLN02919 635 KNLLYVADteNHALREIDfVNETVRTLAGNgtkgSDYQGGKKGT-------SQV-LNSPWDVCFEPVNEKVYIAMAGQ-H 705
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1388 QITENRqvrIAAGRPMHCQVPGVEYPV-GKHAVQTTLESATAIAVSYS-GVLYITETDEKKInRIRQVTTDGEISLVAGI 1465
Cdd:PLN02919 706 QIWEYN---ISDGVTRVFSGDGYERNLnGSSGTSTSFAQPSGISLSPDlKELYIADSESSSI-RALDLKTGGSRLLAGGD 781
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1466 PSECD------------------------CKNDAN---CDCYQ------------------TG-----DGYAKDAKLSAP 1495
Cdd:PLN02919 782 PTFSDnlfkfgdhdgvgsevllqhplgvlCAKDGQiyvADSYNhkikkldpatkrvttlagTGkagfkDGKALKAQLSEP 861
|
330 340
....*....|....*....|....*...
gi 1958683751 1496 SSLAASPDGTLYIADLGNIRIRAVSKNK 1523
Cdd:PLN02919 862 AGLALGENGRLFVADTNNSLIRYLDLNK 889
|
|
| DSL | pfam01414 | Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain ... | 696-739 | 1.60e-05 |
|
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain defined by structure.
Pssm-ID: 460202 Cd Length: 46 Bit Score: 44.15 E-value: 1.60e-05
10 20 30 40
....*....|....*....|....*....|....*....|....*..
gi 1958683751 696 CEEGWTGPACNqRACHPRCAE--HGTC-KDGKCECSQGWNGEHCTIA 739
Cdd:pfam01414 1 CDENYYGSTCS-KFCRPRDDKfgHYTCdANGNKVCLPGWTGPYCDKP 46
|
|
| NHL | cd05819 | NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... | 1488-1614 | 1.31e-04 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 46.54 E-value: 1.31e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1488 KDAKLSAPSSLAASPDGTLYIADLGNIRIRAVSKN-KPLLN-------SMNFYE---VASPTDQELYI----------FD 1546
Cdd:cd05819 3 GPGELNNPQGIAVDSSGNIYVADTGNNRIQVFDPDgNFITSfgsfgsgDGQFNEpagVAVDSDGNLYVadtgnhriqkFD 82
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1958683751 1547 INGTHQYTVSlVTGDYLYNFSY------SNDNDVtAVTDSNGNtlRIrrdpnrmpvRVVSPDNQVIwLTIGTNG 1614
Cdd:cd05819 83 PDGNFLASFG-GSGDGDGEFNGprgiavDSSGNI-YVADTGNH--RI---------QKFDPDGEFL-TTFGSGG 142
|
|
| C_rich_MXAN6577 |
NF041328 |
MXAN_6577-like cysteine-rich domain; |
679-781 |
1.43e-04 |
|
MXAN_6577-like cysteine-rich domain;
Pssm-ID: 469225 [Multi-domain] Cd Length: 145 Bit Score: 44.36 E-value: 1.43e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 679 CSVDCGSHGVCMGGTCRCEEGWT--GPAC--------NQRACHPRCAEHGTCKDGKCecsqgwngehctiahyldkivka 748
Cdd:NF041328 45 CGVACGAGQTCVAGACGCGPGTVacGGACvdtasdpaHCGACGAACAPGQVCEGGAC----------------------- 101
|
90 100 110 120
....*....|....*....|....*....|....*....|
gi 1958683751 749 dkigyKEGCP-GLCNSNGRCT-LDQNGWHC-----VCQPG 781
Cdd:NF041328 102 -----REACSeGLTRCGGACVdLATDPLHCgacgvACDPG 136
|
|
| DSL |
pfam01414 |
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain ... |
665-707 |
1.48e-04 |
|
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain defined by structure.
Pssm-ID: 460202 Cd Length: 46 Bit Score: 41.46 E-value: 1.48e-04
10 20 30 40
....*....|....*....|....*....|....*....|....*.
gi 1958683751 665 CDPNWTGPDCSNEiCSV--DCGSHGVC-MGGTCRCEEGWTGPACNQ 707
Cdd:pfam01414 1 CDENYYGSTCSKF-CRPrdDKFGHYTCdANGNKVCLPGWTGPYCDK 45
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
611-640 |
2.48e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function, and calcium-binding sites have been found at the N-terminus of particular EGF-like domains; calcium binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges, as in non-calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 37.62 E-value: 2.48e-03
10 20 30
....*....|....*....|....*....|....*
gi 1958683751 611 DCLDPG-CSNHGVCIHGE----CHCNPGWGGGNCE 640
Cdd:cd00054 4 ECASGNpCQNGGTCVNTVgsyrCSCPPGYTGRNCE 38
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
758-787 |
3.58e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function, and calcium-binding sites have been found at the N-terminus of particular EGF-like domains; calcium binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges, as in non-calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 37.23 E-value: 3.58e-03
10 20 30
....*....|....*....|....*....|
gi 1958683751 758 PGLCNSNGRCTLDQNGWHCVCQPGWRGAGC 787
Cdd:cd00054 8 GNPCQNGGTCVNTVGSYRCSCPPGYTGRNC 37
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1636-1667 |
8.62e-03 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 36.04 E-value: 8.62e-03
10 20 30
....*....|....*....|....*....|..
gi 1958683751 1636 GLLATKSDETGWTTFFDYDSEGRLTNVTFPTG 1667
Cdd:pfam05593 5 GRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
|
|
|
|
Name |
Accession |
Description |
Interval |
E-value |
| Ten_N |
pfam06484 |
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of ... |
11-307 |
1.16e-170 |
|
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are 'pair-rule' genes and are involved in tissue patterning, probably neural patterning in particular. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain, and translocates to the nucleus. There it probably carries out some transcriptional regulatory activity. The length of this region and its conservation suggest that there may be two structural domains here (personal obs: C Yeats).
Pssm-ID: 461932 [Multi-domain] Cd Length: 367 Bit Score: 529.93 E-value: 1.16e-170
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 11 SLTKSRREKERRYTNSSADNEECRVPTQKSYSSSETLKAFDHDSsRLLYGNRVKDLVHREADEYPRQGQNFTLRQLGVCE 90
Cdd:pfam06484 1 SLTKRRRDKERRYTSSSADSEECRVPTQKSYSSSETLKAFDHDS-RMLYGNRVKDMVHKEADEFSRQGQNFSLRELGICE 79
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 91 SATRRGVAFCAEMGLPHRGYSISAGSDADTENEAVMSPEHAMRLWGRGVKSGRSSCLSSRSNSALTLTDTEHENRSDSE- 169
Cdd:pfam06484 80 PSPRHGLAYCTEMGLPHRGYSISTGSDADTETDGPMSPEHAVRLWGRGTKSGRSSCLSSRSNSALTLTDTEHENKSDNEn 159
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 170 ------------------------------------------------------------------SEQPANNQGQPTLQ 183
Cdd:pfam06484 160 gppippsssssspveqhsppppslnenqrpllgnnashpildsdpdeefspnsylvrtgsgpqsapSEQPPNFQNHSRLR 239
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 184 ----PLPPSHKQHPAHHPSVTSLNRNSLTNRRNQSPAPPAALPAELQTTPESVQLQDSWVLGSNVPLESRHFLFKTGTGT 259
Cdd:pfam06484 240 tpppPLPPPHKQNQHHHPSINSLNRSSLTNRRNPSPAPTASLPAELQSTQESVQLQDSWVLNSNVPLETRHFLFKTGTGT 319
|
330 340 350 360
....*....|....*....|....*....|....*....|....*...
gi 1958683751 260 TPLFSTATPGYTMASGSVYSPPTRPLPRNTLSRSAFKFKKSSRYCSWR 307
Cdd:pfam06484 320 TPLFCTASPGYPLTSGTVYSPPPRPLPRNTFSRPAFKLKKPYKYCSWK 367
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1160-1519 |
6.16e-41 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 155.00 E-value: 6.16e-41
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1160 VSSIMGNGRRrsiscpscnGQADGNKLLA----PVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVLELRNKDFRHSSN 1233
Cdd:cd14953 1 VSTVAGSGTA---------GFSGGGGTAArfnsPSGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAGTGTAGFADGGG 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1234 PAHRYY----LATDPvTGDLYVSDTNTRRIYRpksLTGAKDLTknaeVVAGTGEqclpfdeARCGDGGKAVEATLMSPKG 1309
Cdd:cd14953 72 AAAQFNtpsgVAVDA-AGNLYVADTGNHRIRK---ITPDGVVS----TLAGTGT-------AGFSDDGGATAAQFNYPTG 136
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1310 MAIDKNGLIYFVDGT--MIRKVDQNGIISTLLGSNDLTSAR--PLTcdtsmhisQVRLEWPTDLAINPMDNsIYVLD--N 1383
Cdd:cd14953 137 VAVDAAGNLYVADTGnhRIRKITPDGVVTTVAGTGGAGYAGdgPAT--------AAQFNNPTGVAVDAAGN-LYVADrgN 207
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1384 NVVLQITENRQVRIAAGRPmhcqvpGVEYPVGKHAVQTTLESATAIAVSYSGVLYITETDEkkiNRIRQVTTDGEISLVA 1463
Cdd:cd14953 208 HRIRKITPDGVVTTVAGTG------TAGFSGDGGATAAQLNNPTGVAVDAAGNLYVADSGN---HRIRKITPAGVVTTVA 278
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|....*..
gi 1958683751 1464 GIPSEcdckndancdcyQTGD-GYAKDAKLSAPSSLAASPDGTLYIADLGNIRIRAV 1519
Cdd:cd14953 279 GGGAG------------FSGDgGPATSAQFNNPTGVAVDAAGNLYVADTGNNRIRKI 323
|
|
| Tox-GHH |
pfam15636 |
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH ... |
2636-2713 |
7.78e-40 |
|
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive version of this domain at their C-terminus.
Pssm-ID: 464783 Cd Length: 78 Bit Score: 142.75 E-value: 7.78e-40
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1958683751 2636 EEKARILEQARQRALARAWAREQQRVRDGEEGARLWTEGEKRQLLSAGKVQGYDGYYVLSVEQYPELADSANNIQFLR 2713
Cdd:pfam15636 1 EERKRLLEHAKKRAVREAWHRERQLLRNGLPGSRDWTDEEKEELLSTGSVPGYDGEYIHPVEQYPELADDPSNIRFRK 78
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1240-1520 |
7.30e-38 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 146.14 E-value: 7.30e-38
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1240 LATDPvTGDLYVSDTNTRRIYRpksltgakdLTKNAEV--VAGTGEqclpfdEARCGDGGKAveATLMSPKGMAIDKNGL 1317
Cdd:cd14953 28 VAVDA-AGNLYVADRGNHRIRK---------ITPDGVVttVAGTGT------AGFADGGGAA--AQFNTPSGVAVDAAGN 89
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1318 IYFVDGT--MIRKVDQNGIISTLLGsndlTSARPLTCDTSMhiSQVRLEWPTDLAINPMDNsIYVLD--NNVVLQITENR 1393
Cdd:cd14953 90 LYVADTGnhRIRKITPDGVVSTLAG----TGTAGFSDDGGA--TAAQFNYPTGVAVDAAGN-LYVADtgNHRIRKITPDG 162
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1394 QVRIAAGRPmhcqVPGveYPVGKHAVQTTLESATAIAVSYSGVLYITETDEkkiNRIRQVTTDGEISLVAGIPSEcdckn 1473
Cdd:cd14953 163 VVTTVAGTG----GAG--YAGDGPATAAQFNNPTGVAVDAAGNLYVADRGN---HRIRKITPDGVVTTVAGTGTA----- 228
|
250 260 270 280
....*....|....*....|....*....|....*....|....*..
gi 1958683751 1474 dancdcYQTGDGYAKDAKLSAPSSLAASPDGTLYIADLGNIRIRAVS 1520
Cdd:cd14953 229 ------GFSGDGGATAAQLNNPTGVAVDAAGNLYVADSGNHRIRKIT 269
|
|
| RhsA |
COG3209 |
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction ... |
1473-2413 |
1.05e-33 |
|
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];
Pssm-ID: 442442 [Multi-domain] Cd Length: 1103 Bit Score: 142.97 E-value: 1.05e-33
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1473 NDANCDCYQTGDGYAKDAKLSAPSSLAASPDGTLYIADLGNIRIRAVSKNKPLLNSMNFYEVASPTDQELYIFDINGTHQ 1552
Cdd:COG3209 109 AAATASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGA 188
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1553 YTVSLVTGDYLYNFSYSNDNDVTAVTDSNGNTLRIRRDPNRMPVRVVSPDNQVIWLTIGTNGCLKSMTAQGLELVLFTYH 1632
Cdd:COG3209 189 VTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTLGGTTGAGTG 268
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1633 GNSGLLATK---SDETGWTTFFDYDSEGRLTNVTFPTGVVTNLHGDMDKAITVDIESSSREEDVSITSNLSSIDSFYTMV 1709
Cdd:COG3209 269 ASGAGLDAStgtGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTGGTTTT 348
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1710 QDQLRNSYQIGYDGSLRIFYASGLDSHYQTEPHVLAGTANPTVAKRNMTLPGENGQNLVEWRFRKEQAQGKVNVFGRKLR 1789
Cdd:COG3209 349 VGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAAGALTA 428
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1790 VNGRNLLSVDFDRTTKTEKIYDDHRKFLLRIAYDTSGHPTLWLPSSKLMAVNVTYSSTGQIASIQRGTTSEKVDYDSQGR 1869
Cdd:COG3209 429 GGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLDDTLGG 508
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1870 IVSRVFADGKTWSYTYLEKSMVLLLHSQRQYIFEYDMWDRLSAITMPSVARHTMQTIRSIGYYRNIYNPPESNASIITDY 1949
Cdd:COG3209 509 TTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGASTTTGTTGGT 588
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1950 NEEGLLLQTAFLGTSRRVLFKYRRQTRLSEILYDSTRVSFTYDETAGVLKTVNLQSDGFicTIRYRQIGPLIDRQIFRFS 2029
Cdd:COG3209 589 ATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGT--GVTTTGTTTTRATGTTGTG 666
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 2030 EDGMVNARFDYSYDNSFRVTSMQGVINETPLPIDLYQFDDISGKVEQFGKFGVIYYDINQIISTAVMTYTKHFDAHGRIK 2109
Cdd:COG3209 667 TGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGGTTGTLTT 746
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 2110 EIQYEifRSLMYWITIQYDNMGRVTKREIKIGPFANTTKYAYEYDVDGQLQTVYLNEKIMWRYNYDLNGNLHLLNPSSSA 2189
Cdd:COG3209 747 TSTTT--TTTAGALTYTYDALGRLTSETTPGGVTQGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALGRLTSVITVGSG 824
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 2190 RLTPL-----RYDLRDRITRLgdvqyrldEDGFLRQRGTEIFEYSSKGLLTRVYSKGSGWTviYRYDGLGRRVSSKTSLG 2264
Cdd:COG3209 825 GGTDLqdrtyTYDAAGNITSI--------TDALRAGTLTQTYTYDALGRLTSATDPGTTES--YTYDANGNLTSRTDGGT 894
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 2265 QHLQFFYADLtyPTRITHvynhSSSEITSLYYDLQGHlfameissgdefyiaSDNTGTPLAVFSSNGLMLKQIQYTAYGE 2344
Cdd:COG3209 895 TTYTYDALGR--LVSVTK----PDGTTTTYTYDALGH---------------TDHLGSVRALTDASGQVVWRYDYDPFGN 953
|
890 900 910 920 930 940
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1958683751 2345 IYFDSNIDFQLVIGFHGGLYDPLTKLIHFGERDYDILAGRWTTPDieiwkRIGKDPAPfNLYMFRNNNP 2413
Cdd:COG3209 954 LLAETSGAAANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPD-----PIGLAGGL-NLYAYVGNNP 1016
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1277-1520 |
7.38e-31 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 125.72 E-value: 7.38e-31
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1277 VVAGTGeqclpfdeARCGDGGKAVEATLMSPKGMAIDKNGLIYFVDGT--MIRKVDQNGIISTLLG------SNDLTSAr 1348
Cdd:cd14953 3 TVAGSG--------TAGFSGGGGTAARFNSPSGVAVDAAGNLYVADRGnhRIRKITPDGVVTTVAGtgtagfADGGGAA- 73
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1349 pltcdtsmhisqVRLEWPTDLAINPMDNsIYVLD--NNVVLQITENRQVRIAAGrpmhcqVPGVEYPVGKHAVQTTLESA 1426
Cdd:cd14953 74 ------------AQFNTPSGVAVDAAGN-LYVADtgNHRIRKITPDGVVSTLAG------TGTAGFSDDGGATAAQFNYP 134
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1427 TAIAVSYSGVLYITETDEkkiNRIRQVTTDGEISLVAGIPSEcdckndancdcYQTGDGYAKDAKLSAPSSLAASPDGTL 1506
Cdd:cd14953 135 TGVAVDAAGNLYVADTGN---HRIRKITPDGVVTTVAGTGGA-----------GYAGDGPATAAQFNNPTGVAVDAAGNL 200
|
250
....*....|....
gi 1958683751 1507 YIADLGNIRIRAVS 1520
Cdd:cd14953 201 YVADRGNHRIRKIT 214
|
|
| NHL |
cd05819 |
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... |
1179-1517 |
3.76e-19 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 90.07 E-value: 3.76e-19
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1179 GQADGnKLLAPVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVLELRNKDFRHSSNPAHryyLATDPvTGDLYVSDTNT 1256
Cdd:cd05819 1 GTGPG-ELNNPQGIAVDSSGNIYVADTgnNRIQVFDPDGNFITSFGSFGSGDGQFNEPAG---VAVDS-DGNLYVADTGN 75
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1257 RRIYRpksltgakdLTKNAEVVAGTGeqclpfdearcGDGGKAVEatLMSPKGMAIDKNGLIYFVDgTM---IRKVDQNG 1333
Cdd:cd05819 76 HRIQK---------FDPDGNFLASFG-----------GSGDGDGE--FNGPRGIAVDSSGNIYVAD-TGnhrIQKFDPDG 132
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1334 IISTLLGSNDLTSARpltcdtsmhisqvrLEWPTDLAINPmDNSIYVLDnnvvlqiTENRQVRI--AAGRPMHcQVPGVE 1411
Cdd:cd05819 133 EFLTTFGSGGSGPGQ--------------FNGPTGVAVDS-DGNIYVAD-------TGNHRIQVfdPDGNFLT-TFGSTG 189
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1412 YPVGKhavqttLESATAIAVSYSGVLYITETDEkkiNRIRQVTTDGEISlvagipsecdckndancdcYQTGDGYAKDAK 1491
Cdd:cd05819 190 TGPGQ------FNYPTGIAVDSDGNIYVADSGN---NRVQVFDPDGAGF-------------------GGNGNFLGSDGQ 241
|
330 340
....*....|....*....|....*.
gi 1958683751 1492 LSAPSSLAASPDGTLYIADLGNIRIR 1517
Cdd:cd05819 242 FNRPSGLAVDSDGNLYVADTGNNRIQ 267
|
|
| NHL |
cd05819 |
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... |
1178-1450 |
5.32e-16 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 80.83 E-value: 5.32e-16
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1178 NGQADGNkLLAPVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVLELRNKDFRHSSNPahrYYLATDPvTGDLYVSDTN 1255
Cdd:cd05819 47 FGSGDGQ-FNEPAGVAVDSDGNLYVADTgnHRIQKFDPDGNFLASFGGSGDGDGEFNGP---RGIAVDS-SGNIYVADTG 121
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1256 TRRIYRpksltgakdLTKNAEVVAGTGeqclpfdearcgdGGKAVEATLMSPKGMAIDKNGLIYFVDGT--MIRKVDQNG 1333
Cdd:cd05819 122 NHRIQK---------FDPDGEFLTTFG-------------SGGSGPGQFNGPTGVAVDSDGNIYVADTGnhRIQVFDPDG 179
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1334 IISTLLGSNDLTSARpltcdtsmhisqvrLEWPTDLAINPMDNsIYVLD--NNVVLQITENRQVRIAAGRPMhCQVPGVE 1411
Cdd:cd05819 180 NFLTTFGSTGTGPGQ--------------FNYPTGIAVDSDGN-IYVADsgNNRVQVFDPDGAGFGGNGNFL-GSDGQFN 243
|
250 260 270
....*....|....*....|....*....|....*....
gi 1958683751 1412 YPVGkhavqttlesataIAVSYSGVLYITETDEKKINRI 1450
Cdd:cd05819 244 RPSG-------------LAVDSDGNLYVADTGNNRIQVF 269
|
|
| DUF5885 |
pfam19232 |
Family of unknown function (DUF5885); This is a family of uncharacterized proteins of unknown ... |
521-674 |
1.64e-09 |
|
Family of unknown function (DUF5885); This is a family of uncharacterized proteins of unknown function found in viruses.
Pssm-ID: 437064 Cd Length: 265 Bit Score: 61.18 E-value: 1.64e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 521 CHGNGECvsGTCH-CFPGFLGPDCSRACPVLCsGNGQ----------YSKGRC----LCFSGwkgTECDVPTTQCI-DPQ 584
Cdd:pfam19232 34 CTTDAQC--GTCMtCVAGACTPKASCCGGVTC-GAGQtcdaktntcvYVKGYCsadhPCPSG---SACDTAKNACIaQPP 107
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 585 CG---GRGiCIMG-------------------SCACNSGYK-GENCE--------EADCLDP---------------GCS 618
Cdd:pfam19232 108 YGpdsGKG-CVRGfgawiweldpatnsgvwrcRCANGSLYNsAHECSpladqtlcAAENLDPnalvpassvpafaayGWG 186
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 619 NHGVCIH-------------GECHCNPGWGGGNCEILKTmcpdqCSGHGTYLQESGSCTCD------------PN---WT 670
Cdd:pfam19232 187 NQPVLINkstagaavpsplaGVCPCKPGWAGGSCTEDRT-----CNGRGTWNETTGQCACNidfsghnscgddNNctsWT 261
|
....
gi 1958683751 671 GPDC 674
Cdd:pfam19232 262 GPRC 265
|
|
| Vgb |
COG4257 |
Streptogramin lyase [Defense mechanisms]; |
1189-1467 |
3.08e-09 |
|
Streptogramin lyase [Defense mechanisms];
Pssm-ID: 443399 [Multi-domain] Cd Length: 270 Bit Score: 60.42 E-value: 3.08e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1189 PVALACGIDGSLYVGDF--NYVRRIFP-SGNVTsvlelrnkdfRHSSNPAHRYY-LATDPvTGDLYVSDTNTRRIYRpks 1264
Cdd:COG4257 19 PRDVAVDPDGAVWFTDQggGRIGRLDPaTGEFT----------EYPLGGGSGPHgIAVDP-DGNLWFTDNGNNRIGR--- 84
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1265 LTGAkdlTKNAEVVAGTGEQCLPFdearcgdggkaveatlmspkGMAIDKNGLIYFVDGT--MIRKVD-QNGIISTLLGS 1341
Cdd:COG4257 85 IDPK---TGEITTFALPGGGSNPH--------------------GIAFDPDGNLWFTDQGgnRIGRLDpATGEVTEFPLP 141
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1342 NDLTSARPLTCD--------------------TSMHISQVRLE----WPTDLAINPmDNSIYVLD--NNVVLQITEnrqv 1395
Cdd:COG4257 142 TGGAGPYGIAVDpdgnlwvtdfganaigridpDTGTLTEYALPtpgaGPRGLAVDP-DGNLWVADtgSGRIGRFDP---- 216
|
250 260 270 280 290 300 310
....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1958683751 1396 riAAGRpmhcqvpgveypVGKHAVQTTLESATAIAVSYSGVLYITETDekkINRIRQVTTDGEISLVAgIPS 1467
Cdd:COG4257 217 --KTGT------------VTEYPLPGGGARPYGVAVDGDGRVWFAESG---ANRIVRFDPDTELTEYV-LPS 270
|
|
| Rhs_assc_core |
TIGR03696 |
RHS repeat-associated core domain; This model represents a conserved unique core sequence ... |
2339-2413 |
4.17e-09 |
|
RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.
Pssm-ID: 274730 [Multi-domain] Cd Length: 77 Bit Score: 55.20 E-value: 4.17e-09
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1958683751 2339 YTAYGEIYFDSNIDFQLvIGFHGGLYDPLTKLIHFGERDYDILAGRWTTPDieiwkRIGKDpAPFNLYMFRNNNP 2413
Cdd:TIGR03696 1 YDPYGEVLSESGAAPNP-LRFTGQYYDAETGLYYNGARYYDPELGRFLSPD-----PIGLG-GGLNLYAYVGNNP 68
|
|
| NHL_PKND_like |
cd14952 |
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein ... |
1240-1516 |
9.70e-09 |
|
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein with a cytosolic kinase domain and an extracellular sensor domain that contains NHL repeats. It plays a key role in the development of central nervous system tuberculosis, by mediating the invasion of host brain endothelia. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271322 [Multi-domain] Cd Length: 247 Bit Score: 58.76 E-value: 9.70e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1240 LATDPvTGDLYVSDTNTRRIYRpksltgakdltknaeVVAGTGEQ-CLPFDEarcgdggkaveatLMSPKGMAIDKNGLI 1318
Cdd:cd14952 15 VAVDA-AGNVYVADSGNNRVLK---------------LAAGSTTQtVLPFTG-------------LYQPQGVAVDAAGTV 65
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1319 YFVDGtmirkvDQNGIISTLLGSNDLTsARPLTcdtsmhisqvRLEWPTDLAINPMDNsIYVLDNnvvlqiTENRQVRIA 1398
Cdd:cd14952 66 YVTDF------GNNRVLKLAAGSTTQT-VLPFT----------GLNDPTGVAVDAAGN-VYVADT------GNNRVLKLA 121
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1399 AGRPMHCQVPgveypvgkhavQTTLESATAIAVSYSGVLYITETDEKKINRIRQVTTD------GEISLVAGIPSecdck 1472
Cdd:cd14952 122 AGSNTQTVLP-----------FTGLSNPDGVAVDGAGNVYVTDTGNNRVLKLAAGSTTqtvlpfTGLNSPSGVAV----- 185
|
250 260 270 280 290
....*....|....*....|....*....|....*....|....*....|....*....
gi 1958683751 1473 nDANCDCYQTGDGYAKDAKLSA---------------PSSLAASPDGTLYIADLGNIRI 1516
Cdd:cd14952 186 -DTAGNVYVTDHGNNRVLKLAAgsttptvlpftglngPLGVAVDAAGNVYVADRGNDRV 243
|
|
| NHL_like_2 |
cd14957 |
Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and ... |
1304-1614 |
3.47e-08 |
|
Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271327 [Multi-domain] Cd Length: 280 Bit Score: 57.66 E-value: 3.47e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1304 LMSPKGMAIDKNGLIYFVD--GTMIRKVDQNGIISTLLGSNDltsarpltcdtsmhISQVRLEWPTDLAINPMDNsIYVL 1381
Cdd:cd14957 17 FNTPRGIAVDSAGNIYVADtgNNRIQVFTSSGVYSYSIGSGG--------------TGSGQFNSPYGIAVDSNGN-IYVA 81
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1382 DNNvvlqitENR-QVRIAAGrpmhcqvpGVEYPVGKHAVQTT-LESATAIAVSYSGVLYITETDEkkiNRIRQVTTDGEI 1459
Cdd:cd14957 82 DTD------NNRiQVFNSSG--------VYQYSIGTGGSGDGqFNGPYGIAVDSNGNIYVADTGN---HRIQVFTSSGTF 144
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1460 slvagipsecdckndancdCYQTGDGYAKDAKLSAPSSLAASPDGTLYIADLGNIRIRavsknkpllnsmnfyevasptd 1539
Cdd:cd14957 145 -------------------SYSIGSGGTGPGQFNGPQGIAVDSDGNIYVADTGNHRIQ---------------------- 183
|
250 260 270 280 290 300 310
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1958683751 1540 qelyIFDINGTHQYTV-SLVTGDYLynFSYSNDNDVtavtDSNGNTLRIRRDPNRmpVRVVSPDNqVIWLTIGTNG 1614
Cdd:cd14957 184 ----VFTSSGTFQYTFgSSGSGPGQ--FSDPYGIAV----DSDGNIYVADTGNHR--IQVFTSSG-AYQYSIGTSG 246
|
|
| acid_disulf_rpt |
NF033662 |
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with ... |
790-820 |
6.35e-08 |
|
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids.
Pssm-ID: 411265 [Multi-domain] Cd Length: 32 Bit Score: 50.59 E-value: 6.35e-08
10 20 30
....*....|....*....|....*....|.
gi 1958683751 790 AMETLCTDSKDNEGDGLVDCMDPDCCLQSSC 820
Cdd:NF033662 2 ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC 32
|
|
| C_rich_MXAN6577 |
NF041328 |
MXAN_6577-like cysteine-rich domain; |
571-725 |
5.20e-07 |
|
MXAN_6577-like cysteine-rich domain;
Pssm-ID: 469225 [Multi-domain] Cd Length: 145 Bit Score: 51.30 E-value: 5.20e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 571 TECDVPTTQCIDPQ--CGGRGICIMGScACNSGykgeNCEEAdcldpgCSNHGVCIHGECHCNPGwgggnceilKTMCPD 648
Cdd:NF041328 12 AGCPEPGAVCPEGLsvCGGACVDLRSD-PSNCG----ACGVA------CGAGQTCVAGACGCGPG---------TVACGG 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 649 QCSGHGTylqesgsctcDPNWTGPdcsneiCSVDCGSHGVCMGGTCR--CEEGWT--GPAC--------NQRACHPRCAE 716
Cdd:NF041328 72 ACVDTAS----------DPAHCGA------CGAACAPGQVCEGGACReaCSEGLTrcGGACvdlatdplHCGACGVACDP 135
|
....*....
gi 1958683751 717 HGTCKDGKC 725
Cdd:NF041328 136 GESCRGGAC 144
|
|
| Vgb |
COG4257 |
Streptogramin lyase [Defense mechanisms]; |
1238-1517 |
1.10e-06 |
|
Streptogramin lyase [Defense mechanisms];
Pssm-ID: 443399 [Multi-domain] Cd Length: 270 Bit Score: 52.71 E-value: 1.10e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1238 YYLATDPvTGDLYVSDTNTRRIYRpksltgakdltknaeVVAGTGEqclpFDEARCGDGGkaveatlmSPKGMAIDKNGL 1317
Cdd:COG4257 20 RDVAVDP-DGAVWFTDQGGGRIGR---------------LDPATGE----FTEYPLGGGS--------GPHGIAVDPDGN 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1318 IYFVDGT--MIRKVD-QNGIISTLLGSNDLTSarpltcdtsmhisqvrlewPTDLAINPmDNSIYVLD--NNVVLQIT-E 1391
Cdd:COG4257 72 LWFTDNGnnRIGRIDpKTGEITTFALPGGGSN-------------------PHGIAFDP-DGNLWFTDqgGNRIGRLDpA 131
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1392 NRQVRiaagrpmhcqvpgvEYPVGKHAVQTTlesatAIAVSYSGVLYITETdekKINRIRQVTTD-GEISLvagipsecd 1470
Cdd:COG4257 132 TGEVT--------------EFPLPTGGAGPY-----GIAVDPDGNLWVTDF---GANAIGRIDPDtGTLTE--------- 180
|
250 260 270 280
....*....|....*....|....*....|....*....|....*..
gi 1958683751 1471 ckndancdcyqtgdgYAKDAKLSAPSSLAASPDGTLYIADLGNIRIR 1517
Cdd:COG4257 181 ---------------YALPTPGAGPRGLAVDPDGNLWVADTGSGRIG 212
|
|
| PLN02919 |
PLN02919 |
haloacid dehalogenase-like hydrolase family protein |
1240-1523 |
2.51e-06 |
|
haloacid dehalogenase-like hydrolase family protein
Pssm-ID: 215497 [Multi-domain] Cd Length: 1057 Bit Score: 53.32 E-value: 2.51e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1240 LATDPVTGDLYVSDTNTRRIYrpksltgAKDLTKNAEV-VAGTGEQCL---PFDEArcgdggkaveaTLMSPKGMAID-K 1314
Cdd:PLN02919 573 LAIDLLNNRLFISDSNHNRIV-------VTDLDGNFIVqIGSTGEEGLrdgSFEDA-----------TFNRPQGLAYNaK 634
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1315 NGLIYFVD--GTMIRKVD-QNGIISTLLGS----NDLTSARPLTcdtsmhiSQVrLEWPTDLAINPMDNSIYVLDNNVvL 1387
Cdd:PLN02919 635 KNLLYVADteNHALREIDfVNETVRTLAGNgtkgSDYQGGKKGT-------SQV-LNSPWDVCFEPVNEKVYIAMAGQ-H 705
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1388 QITENRqvrIAAGRPMHCQVPGVEYPV-GKHAVQTTLESATAIAVSYS-GVLYITETDEKKInRIRQVTTDGEISLVAGI 1465
Cdd:PLN02919 706 QIWEYN---ISDGVTRVFSGDGYERNLnGSSGTSTSFAQPSGISLSPDlKELYIADSESSSI-RALDLKTGGSRLLAGGD 781
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1466 PSECD------------------------CKNDAN---CDCYQ------------------TG-----DGYAKDAKLSAP 1495
Cdd:PLN02919 782 PTFSDnlfkfgdhdgvgsevllqhplgvlCAKDGQiyvADSYNhkikkldpatkrvttlagTGkagfkDGKALKAQLSEP 861
|
330 340
....*....|....*....|....*...
gi 1958683751 1496 SSLAASPDGTLYIADLGNIRIRAVSKNK 1523
Cdd:PLN02919 862 AGLALGENGRLFVADTNNSLIRYLDLNK 889
|
|
| NHL_PKND_like |
cd14952 |
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein ... |
1186-1447 |
4.06e-06 |
|
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein with a cytosolic kinase domain and an extracellular sensor domain that contains NHL repeats. It plays a key role in the development of central nervous system tuberculosis, by mediating the invasion of host brain endothelia. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271322 [Multi-domain] Cd Length: 247 Bit Score: 50.67 E-value: 4.06e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1186 LLAPVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVLElrnkdFRHSSNPAHryyLATDPVtGDLYVSDTNTRRIyrpk 1263
Cdd:cd14952 51 LYQPQGVAVDAAGTVYVTDFgnNRVLKLAAGSTTQTVLP-----FTGLNDPTG---VAVDAA-GNVYVADTGNNRV---- 117
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1264 sltgakdltknAEVVAGTGEQC-LPFdearcgdggkaveATLMSPKGMAIDKNGLIYFVDGtmirkvDQNGIISTLLGSN 1342
Cdd:cd14952 118 -----------LKLAAGSNTQTvLPF-------------TGLSNPDGVAVDGAGNVYVTDT------GNNRVLKLAAGST 167
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1343 DLTsARPLTCDTSmhisqvrlewPTDLAINPMDNsIYVLDNNvvlqitENRQVRIAAGRPMHCQVP--GVEYPVGkhavq 1420
Cdd:cd14952 168 TQT-VLPFTGLNS----------PSGVAVDTAGN-VYVTDHG------NNRVLKLAAGSTTPTVLPftGLNGPLG----- 224
|
250 260
....*....|....*....|....*..
gi 1958683751 1421 ttlesataIAVSYSGVLYITETDEKKI 1447
Cdd:cd14952 225 --------VAVDAAGNVYVADRGNDRV 243
|
|
| DSL |
pfam01414 |
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain ... |
696-739 |
1.60e-05 |
|
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain defined by structure.
Pssm-ID: 460202 Cd Length: 46 Bit Score: 44.15 E-value: 1.60e-05
10 20 30 40
....*....|....*....|....*....|....*....|....*..
gi 1958683751 696 CEEGWTGPACNqRACHPRCAE--HGTC-KDGKCECSQGWNGEHCTIA 739
Cdd:pfam01414 1 CDENYYGSTCS-KFCRPRDDKfgHYTCdANGNKVCLPGWTGPYCDKP 46
|
|
| NHL |
cd05819 |
NHL repeat unit of beta-propeller proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in ... |
1488-1614 |
1.31e-04 |
|
NHL repeat unit of beta-propeller proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have catalytic activity in peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat, and this interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 46.54 E-value: 1.31e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1488 KDAKLSAPSSLAASPDGTLYIADLGNIRIRAVSKN-KPLLN-------SMNFYE---VASPTDQELYI----------FD 1546
Cdd:cd05819 3 GPGELNNPQGIAVDSSGNIYVADTGNNRIQVFDPDgNFITSfgsfgsgDGQFNEpagVAVDSDGNLYVadtgnhriqkFD 82
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1958683751 1547 INGTHQYTVSlVTGDYLYNFSY------SNDNDVtAVTDSNGNtlRIrrdpnrmpvRVVSPDNQVIwLTIGTNG 1614
Cdd:cd05819 83 PDGNFLASFG-GSGDGDGEFNGprgiavDSSGNI-YVADTGNH--RI---------QKFDPDGEFL-TTFGSGG 142
|
|
| C_rich_MXAN6577 |
NF041328 |
MXAN_6577-like cysteine-rich domain; |
679-781 |
1.43e-04 |
|
MXAN_6577-like cysteine-rich domain;
Pssm-ID: 469225 [Multi-domain] Cd Length: 145 Bit Score: 44.36 E-value: 1.43e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 679 CSVDCGSHGVCMGGTCRCEEGWT--GPAC--------NQRACHPRCAEHGTCKDGKCecsqgwngehctiahyldkivka 748
Cdd:NF041328 45 CGVACGAGQTCVAGACGCGPGTVacGGACvdtasdpaHCGACGAACAPGQVCEGGAC----------------------- 101
|
90 100 110 120
....*....|....*....|....*....|....*....|
gi 1958683751 749 dkigyKEGCP-GLCNSNGRCT-LDQNGWHC-----VCQPG 781
Cdd:NF041328 102 -----REACSeGLTRCGGACVdLATDPLHCgacgvACDPG 136
|
|
| DSL |
pfam01414 |
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain ... |
665-707 |
1.48e-04 |
|
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain defined by structure.
Pssm-ID: 460202 Cd Length: 46 Bit Score: 41.46 E-value: 1.48e-04
10 20 30 40
....*....|....*....|....*....|....*....|....*.
gi 1958683751 665 CDPNWTGPDCSNEiCSV--DCGSHGVC-MGGTCRCEEGWTGPACNQ 707
Cdd:pfam01414 1 CDENYYGSTCSKF-CRPrdDKFGHYTCdANGNKVCLPGWTGPYCDK 45
|
|
| NHL_like_5 |
cd14963 |
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) ... |
1185-1442 |
1.63e-04 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271333 [Multi-domain] Cd Length: 268 Bit Score: 46.13 E-value: 1.63e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1185 KLLAPVALACGIDGSLYVGDFnYVRRI--F-PSGNVTSVLElRNKDFRHSSNPAHryyLATDpvTGDLYVSDTNTRRIYr 1261
Cdd:cd14963 54 EFKYPYGIAVDSDGNIYVADL-YNGRIqvFdPDGKFLKYFP-EKKDRVKLISPAG---LAID--DGKLYVSDVKKHKVI- 125
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1262 pksltgakdltknaeVVAGTGEQCLPFdearcGDGGKAvEATLMSPKGMAIDKNGLIYFVD--GTMIRKVDQNG-IISTL 1338
Cdd:cd14963 126 ---------------VFDLEGKLLLEF-----GKPGSE-PGELSYPNGIAVDEDGNIYVADsgNGRIQVFDKNGkFIKEL 184
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1339 LGSNDLTSArpltcdtsmhisqvrLEWPTDLAINPmDNSIYVLDN--NVVLQITENRQVRIAAGRpmhcqvPGVEypvgk 1416
Cdd:cd14963 185 NGSPDGKSG---------------FVNPRGIAVDP-DGNLYVVDNlsHRVYVFDEQGKELFTFGG------RGKD----- 237
|
250 260
....*....|....*....|....*.
gi 1958683751 1417 havQTTLESATAIAVSYSGVLYITET 1442
Cdd:cd14963 238 ---DGQFNLPNGLFIDDDGRLYVTDR 260
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1483-1522 |
3.13e-04 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 45.60 E-value: 3.13e-04
10 20 30 40
....*....|....*....|....*....|....*....|
gi 1958683751 1483 GDGYAKDAKLSAPSSLAASPDGTLYIADLGNIRIRAVSKN 1522
Cdd:cd14953 13 SGGGGTAARFNSPSGVAVDAAGNLYVADRGNHRIRKITPD 52
|
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
521-543 |
4.78e-04 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
Pssm-ID: 400365 Cd Length: 26 Bit Score: 39.64 E-value: 4.78e-04
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
650-674 |
1.13e-03 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
Pssm-ID: 400365 Cd Length: 26 Bit Score: 38.48 E-value: 1.13e-03
|
| YvrE |
COG3386 |
Sugar lactone lactonase YvrE [Carbohydrate transport and metabolism]; Sugar lactone lactonase ... |
1192-1335 |
2.12e-03 |
|
Sugar lactone lactonase YvrE [Carbohydrate transport and metabolism]; Sugar lactone lactonase YvrE is part of the Pathway/BioSystem: Non-phosphorylated Entner-Doudoroff pathway.
Pssm-ID: 442613 [Multi-domain] Cd Length: 266 Bit Score: 42.57 E-value: 2.12e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1192 LACGIDGSLYVGDFNYVR------RIFPSGNVTSVLElrnkDFrHSSN-----PAHRYylatdpvtgdLYVSDTNTRRIY 1260
Cdd:COG3386 98 GVVDPDGRLYFTDMGEYLptgalyRVDPDGSLRVLAD----GL-TFPNgiafsPDGRT----------LYVADTGAGRIY 162
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1958683751 1261 R-PKSLTGAkdLTkNAEVVAgtgeqclpfdEARCGDGGkaveatlmsPKGMAIDKNGLIY--FVDGTMIRKVDQNGII 1335
Cdd:COG3386 163 RfDLDADGT--LG-NRRVFA----------DLPDGPGG---------PDGLAVDADGNLWvaLWGGGGVVRFDPDGEL 218
|
|
| NHL_like_6 |
cd14962 |
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) ... |
1297-1516 |
2.33e-03 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271332 [Multi-domain] Cd Length: 271 Bit Score: 42.57 E-value: 2.33e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1297 GKAVEATLMSPKGMAIDKNGLIYFVDGT--MIRKVDQNGIISTLLGSNDLtsarpltcdtsmhisQVRlewPTDLAINPM 1374
Cdd:cd14962 49 GNAGPNRFVSPIGVAIDANGNLYVSDAElgKVFVFDRDGKFLRAIGAGAL---------------FKR---PTGIAVDPA 110
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1375 DNSIYVLDnnvvlqiTENRQVRI--AAGRPMHcQVPgveyPVGKHAVQttLESATAIAVSYSGVLYITETDEKKINRI-- 1450
Cdd:cd14962 111 GKRLYVVD-------TLAHKVKVfdLDGRLLF-DIG----KRGSGPGE--FNLPTDLAVDRDGNLYVTDTMNFRVQIFda 176
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1451 --RQVTTDGEISLVAG---IPSECDCKNDAN---CDCY--------QTGD-------GYAKDAKLSAPSSLAASPDGTLY 1507
Cdd:cd14962 177 dgKFLRSFGERGDGPGsfaRPKGIAVDSEGNiyvVDAAfdnvqifnPEGEllltvggPGSGPGEFYLPSGIAIDKDDRIY 256
|
....*....
gi 1958683751 1508 IADLGNIRI 1516
Cdd:cd14962 257 VVDQFNRRI 265
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
611-640 |
2.48e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 37.62 E-value: 2.48e-03
10 20 30
....*....|....*....|....*....|....*
gi 1958683751 611 DCLDPG-CSNHGVCIHGE----CHCNPGWGGGNCE 640
Cdd:cd00054 4 ECASGNpCQNGGTCVNTVgsyrCSCPPGYTGRNCE 38
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
1844-1884 |
2.50e-03 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 37.95 E-value: 2.50e-03
10 20 30 40
....*....|....*....|....*....|....*....|..
gi 1958683751 1844 YSSTGQ-IASIQRGTTSEKVDYDSQGRIVSRVFADGKTWSYT 1884
Cdd:TIGR01643 1 YDAAGRlTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRYE 42
|
|
| Vgb |
COG4257 |
Streptogramin lyase [Defense mechanisms]; |
1184-1261 |
2.63e-03 |
|
Streptogramin lyase [Defense mechanisms];
Pssm-ID: 443399 [Multi-domain] Cd Length: 270 Bit Score: 42.31 E-value: 2.63e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1184 NKLLAPVALACGIDGSLYVGDF--NYVRRIFP-SGNVTSvlelrnkdFRHSSNPAHRYYLATDPvTGDLYVSDTNTRRIY 1260
Cdd:COG4257 185 TPGAGPRGLAVDPDGNLWVADTgsGRIGRFDPkTGTVTE--------YPLPGGGARPYGVAVDG-DGRVWFAESGANRIV 255
|
.
gi 1958683751 1261 R 1261
Cdd:COG4257 256 R 256
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
758-787 |
3.58e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 37.23 E-value: 3.58e-03
10 20 30
....*....|....*....|....*....|
gi 1958683751 758 PGLCNSNGRCTLDQNGWHCVCQPGWRGAGC 787
Cdd:cd00054 8 GNPCQNGGTCVNTVGSYRCSCPPGYTGRNC 37
|
|
| NHL_like_5 |
cd14963 |
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) ... |
1491-1587 |
4.59e-03 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271333 [Multi-domain] Cd Length: 268 Bit Score: 41.51 E-value: 4.59e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 1491 KLSAPSSLAASPDGTLYIADLGNIRIRAVSKN----------KPLLNSM----------NFYeVASPTDQELYIFDINGT 1550
Cdd:cd14963 54 EFKYPYGIAVDSDGNIYVADLYNGRIQVFDPDgkflkyfpekKDRVKLIspaglaiddgKLY-VSDVKKHKVIVFDLEGK 132
|
90 100 110 120
....*....|....*....|....*....|....*....|..
gi 1958683751 1551 HQYTVSLVtGDYLYNFSYSN----DNDVT-AVTDSNGNtlRI 1587
Cdd:cd14963 133 LLLEFGKP-GSEPGELSYPNgiavDEDGNiYVADSGNG--RI 171
|
|
| Keratin_B2 |
pfam01500 |
Keratin, high sulfur B2 protein; High sulfur proteins are cysteine-rich proteins synthesized ... |
607-725 |
4.96e-03 |
|
Keratin, high sulfur B2 protein; High sulfur proteins are cysteine-rich proteins synthesized during the differentiation of hair matrix cells, and form hair fibres in association with hair keratin intermediate filaments. This family has been divided up into four regions, with the second region containing 8 copies of a short repeat. This family is also known as B2 or KAP1.
Pssm-ID: 366678 [Multi-domain] Cd Length: 161 Bit Score: 40.16 E-value: 4.96e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958683751 607 CEEADCLDPGCSNHGVCihGECHCNPGWGGGNCeILKTMCPDQCSGHGTYLQESGSCTCDPNwTGPDCSNEICSVDCGSH 686
Cdd:pfam01500 4 CGTSFCGFPTCSTGGTC--GSGCCQPCCCQSSC-CRPSCCQTSCCQPTTFQSSCCRPTCQPC-CQTSCCQPTCCQTSSCQ 79
|
90 100 110
....*....|....*....|....*....|....*....
gi 1958683751 687 GVCMGGTCRCEEGWTGPACNQRACHPRCAEHGTCKDGKC 725
Cdd:pfam01500 80 TGCGGIGYGQEGSSGAVSSRTRWCRPDCRVEGTCLPPCC 118
|
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
617-639 |
5.30e-03 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
Pssm-ID: 400365 Cd Length: 26 Bit Score: 36.56 E-value: 5.30e-03
|
| EGF_Tenascin |
pfam18720 |
Tenascin EGF domain; This entry represents the EGF-like domains found in tenascin proteins. |
585-607 |
7.86e-03 |
|
Tenascin EGF domain; This entry represents the EGF-like domains found in tenascin proteins.
Pssm-ID: 376143 Cd Length: 29 Bit Score: 36.12 E-value: 7.86e-03
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1636-1667 |
8.62e-03 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 36.04 E-value: 8.62e-03
10 20 30
....*....|....*....|....*....|..
gi 1958683751 1636 GLLATKSDETGWTTFFDYDSEGRLTNVTFPTG 1667
Cdd:pfam05593 5 GRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
|
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
585-607 |
9.01e-03 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
Pssm-ID: 400365 Cd Length: 26 Bit Score: 35.79 E-value: 9.01e-03
|
| EGF |
pfam00008 |
EGF-like domain; There is no clear separation between noise and signal. pfam00053 is very ... |
761-784 |
9.28e-03 |
|
EGF-like domain; There is no clear separation between noise and signal. pfam00053 is very similar, but has 8 instead of 6 conserved cysteines. Includes some cytokine receptors. The EGF domain lacks the N-terminal regions of the Ca2+-binding EGF domains (this is the main reason for the discrepancy between Swiss-Prot domain start/end and Pfam). The family is hard to model because there are many similar but distinct sub-types of EGF domains. Pfam certainly misses a number of EGF domains.
Pssm-ID: 394967 Cd Length: 31 Bit Score: 35.82 E-value: 9.28e-03
|
|