Conserved domains on  [gi|1904527050|gb|KAF7469179|]

transcription elongation factor SPT5 [Marmota monax]

List of domain hits

Name Accession Description Interval E-value
NGN_Euk cd09888
Eukaryotic N-Utilization Substance G (NusG) N-terminal (NGN) domain, including plant KTF1 (KOW ...
113-192 1.12e-37

Eukaryotic N-Utilization Substance G (NusG) N-terminal (NGN) domain, including plant KTF1 (KOW domain-containing Transcription Factor 1); The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides, Ouzounis, and Woese (KOW) repeats at its C-terminus. Spt5 forms an Spt4-Spt5 complex that is an essential RNA polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination-enhancing activity in Escherichia coli, and it has a variety of functions in bacteria, including roles in RNA polymerase elongation and Rho-dependent termination. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements differ. Spt5-like is homologous to the Spt5 proteins present in all eukaryotes but is unique in that it encodes a protein with an additional long carboxy-terminal extension that contains WG/GW motifs. Spt5-like, or KTF1 (KOW domain-containing Transcription Factor 1), is an RNA-directed DNA methylation (RdDM) pathway effector in plants.


Pssm-ID: 193577 [Multi-domain]  Cd Length: 86  Bit Score: 135.74  E-value: 1.12e-37
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  113 IGEERATAISLMRKFIAYQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLgyWNQQMVPIKEMTDVLK 192
Cdd:cd09888      9 PGKEREIVISLMRKFLDLQRTGNPLGIKSVFARDGLKGYIYIEARKEAHVKDAIEGLRGVYL--NTIKLVPIKEMPDVLS 86
CTD smart01104
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
699-816 1.15e-29

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 215026 [Multi-domain]  Cd Length: 121  Bit Score: 114.16  E-value: 1.15e-29
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050   699 GSQTPMYG-SGSRTPMYGSQTP----LQDGSRTPHYGSQTPLHDG--SRTPAQSGAWdPNNPNTPSRAEEEYEYAFDDEP 771
Cdd:smart01104    1 GGRTPAWGaSGSKTPAWGSRTPgtaaGGAPTARGGSGSRTPAWGGagSRTPAWGGAG-PTGSRTPAWGGASAWGNKSSEG 79
                            90       100       110       120
                    ....*....|....*....|....*....|....*....|....*..
gi 1904527050   772 TPSPQA--YGGTPNPQTPGYpdpssPQVNPQYNPQTPGTPAMYNTDQ 816
Cdd:smart01104   80 SASSWAagPGGAYGAPTPGY-----GGTPSAYGPATPGGGAMAGSAS 121
KOW_Spt5_3 cd06083
KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
398-448 2.58e-27

KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is an RNA-binding motif found so far in some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins, and Mtr4. KOW_Spt5 domains play critical roles in the recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and are additionally involved in the binding of eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240507  Cd Length: 51  Bit Score: 104.91  E-value: 2.58e-27
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1904527050  398 YFKMGDHVKVIAGRFEGDTGLIVRVEENFIILFSDLTMHELKVLPRDLQLC 448
Cdd:cd06083      1 HFKVGDHVKVISGRHEGETGLVVKVEDDVVTVFSDLTMRELKVFPRDLQLS 51
KOW_Spt5_6 cd06086
KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
955-1011 7.03e-27

KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is an RNA-binding motif found so far in some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins, and Mtr4. KOW_Spt5 domains play critical roles in the recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and are additionally involved in the binding of eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240510  Cd Length: 58  Bit Score: 103.75  E-value: 7.03e-27
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1904527050  955 EHLEPITPTKNNKVKVILGEDREATGVLLSIDGEDGIVRMDLDEQLKILNLRFLGKL 1011
Cdd:cd06086      1 EHLEPVPPEKGDRVKVIKGEDRGSTGELISIDGADGIVKMDSDGDIKILPMNFLAKL 57
KOW_Spt5_2 cd06082
KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
347-397 1.31e-26

KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is an RNA-binding motif found so far in some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins, and Mtr4. KOW_Spt5 domains play critical roles in the recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and are additionally involved in the binding of eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240506  Cd Length: 51  Bit Score: 102.97  E-value: 1.31e-26
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1904527050  347 FQPGDNVEVCEGELINLQGKVLSVDGNKITIMPKHEDLKDMLEFPAQELRK 397
Cdd:cd06082      1 FQPGDNVEVIEGELKGLQGKVESVDGDIVTIMPKHEDLKEPLEFPAKELRK 51
KOW_Spt5_5 cd06085
KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
629-678 1.26e-25

KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is an RNA-binding motif found so far in some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins, and Mtr4. KOW_Spt5 domains play critical roles in the recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and are additionally involved in the binding of eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240509  Cd Length: 52  Bit Score: 100.25  E-value: 1.26e-25
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|
gi 1904527050  629 DNELIGQTVRISQGPYKGYIGVVKDATESTARVELHSTCQTISVDRQRLT 678
Cdd:cd06085      2 RDPLIGKTVRIRKGPYKGYIGIVKDATGTTARVELHSKNKTITVDRSRLA 51
KOW_Spt5_1 cd06081
KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
203-240 4.51e-18

KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is an RNA-binding motif found so far in some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins, and Mtr4. KOW_Spt5 domains play critical roles in the recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and are additionally involved in the binding of eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240505  Cd Length: 38  Bit Score: 78.28  E-value: 4.51e-18
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 1904527050  203 KSWVRLKRGIYKDDIAQVDYVEPSQNTISLKMIPRIDY 240
Cdd:cd06081      1 GSWVRIKRGIYKGDLAQVDEVDENGNRVVVKLIPRIDY 38
KOW_Spt5_4 cd06084
KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
524-566 5.27e-18

KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is an RNA-binding motif found so far in some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins, and Mtr4. KOW_Spt5 domains play critical roles in the recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and are additionally involved in the binding of eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240508  Cd Length: 43  Bit Score: 78.33  E-value: 5.27e-18
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|...
gi 1904527050  524 KDIVKVIDGPHSDREGEIRHLYHSFAFLHCKKLVENGGMFVCK 566
Cdd:cd06084      1 GDTVKVVDGPYKGRQGTVLHIYRGTLFLHSREVTENGGIFVVR 43
PHA03269 super family cl29788
envelope glycoprotein C; Provisional
752-897 1.16e-08

envelope glycoprotein C; Provisional


The actual alignment was detected with superfamily member PHA03269:

Pssm-ID: 165527 [Multi-domain]  Cd Length: 566  Bit Score: 58.97  E-value: 1.16e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  752 NPNTPSRAEEEYEYAFDDEPTPSPQayggtPNPQTPGYPDPS-SPQVNPQYNPQtpgtpamyntdqfsPYAAPSPQGSYQ 830
Cdd:PHA03269    21 NLNTNIPIPELHTSAATQKPDPAPA-----PHQAASRAPDPAvAPTSAASRKPD--------------LAQAPTPAASEK 81
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1904527050  831 PSPSPQSYHQV--APSPAGYQNTHSPasyhPTPSPM-----AYQASPSPSPVGYSPMTPgAPSPGGYNPHTPGS 897
Cdd:PHA03269    82 FDPAPAPHQAAsrAPDPAVAPQLAAA----PKPDAAeaftsAAQAHEAPADAGTSAASK-KPDPAAHTQHSPPP 150
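Each hit above pairs a query row (gi 1904527050) with a domain-model row (Cdd:...). As a rough illustration of how such a pair can be compared, the following minimal Python sketch computes percent identity over aligned columns. The function name percent_identity is hypothetical and not part of any NCBI tool; the sketch assumes that "-" marks a gap and that lowercase letters mark unaligned residues that sit opposite gaps, and it ignores substitution scoring, so it does not reproduce the Bit Score or E-value reported on the page.

```python
def percent_identity(query: str, subject: str) -> float:
    """Percent of gap-free alignment columns whose residues are identical.

    Assumption: both strings are rows of the same pairwise alignment,
    '-' marks a gap, and letter case is ignored when comparing residues.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    aligned = 0
    matches = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            continue  # skip columns where either row has a gap
        aligned += 1
        if q.upper() == s.upper():
            matches += 1
    return 100.0 * matches / aligned if aligned else 0.0


# Alignment rows copied from the KOW_Spt5_3 (cd06083) hit in the list above.
QUERY = "YFKMGDHVKVIAGRFEGDTGLIVRVEENFIILFSDLTMHELKVLPRDLQLC"
SUBJECT = "HFKVGDHVKVISGRHEGETGLVVKVEDDVVTVFSDLTMRELKVFPRDLQLS"

if __name__ == "__main__":
    print(f"{percent_identity(QUERY, SUBJECT):.1f}% identity over aligned columns")
```

For alignments that wrap across several blocks, such as the smart01104 hit, the rows of each block would need to be concatenated before calling the function.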
 
Spt5-NGN pfam03439
Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG ...
113-191 1.63e-25

Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG are shown to contain a novel 'NGN' domain. The combined NGN and KOW motif regions of Spt5 form the binding domain with Spt4. Spt5 complexes with Spt4 as a 1:1 heterodimer, and this Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The Schizosaccharomyces pombe core Spt5-Spt4 complex is a heterodimer bearing a trypsin-resistant Spt4-binding domain within the Spt5 subunit.


Pssm-ID: 397481  Cd Length: 84  Bit Score: 101.12  E-value: 1.63e-25
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527050  113 IGEERATAISLMRKFIAYQfTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLGywNQQMVPIKEMTDVL 191
Cdd:pfam03439    9 PGQEREVALSLMRKILALA-KTNNLGIYSVFAPDGLKGYIYVEADRQAAVKRALEGIPNVRGL--VPGLVPIKEMEHLL 84
CTD pfam12815
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
699-757 1.27e-15

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 372327 [Multi-domain]  Cd Length: 71  Bit Score: 72.48  E-value: 1.27e-15
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  699 GSQTPMYGS--GSRTPMY---GSQTPL--QDGSRTPHY--GSQTPLHD--GSRTPAQSGAWDPnnPNTPS 757
Cdd:pfam12815    1 GSRTPAYNSagGSRTPAWgadGSRTPAygGAGGRTPAYnqGGKTPAWGgaGSRTPAYYGAWGG--SRTPA 68
nusG PRK08559
transcription antitermination protein NusG; Validated
114-243 1.08e-10

transcription antitermination protein NusG; Validated


Pssm-ID: 181467 [Multi-domain]  Cd Length: 153  Bit Score: 61.04  E-value: 1.08e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  114 GEERATAISLMRKFIAYQftdtpLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRlGYwNQQMVPIKEMTDVLKV 193
Cdd:PRK08559    16 GQERNVALMLAMRAKKEN-----LPIYAILAPPELKGYVLVEAESKGAVEEAIRGIPHVR-GV-VPGEISFEEVEHFLKP 88
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527050  194 VKEVANLKPKSWVRLKRGIYKDDIAQVDYVEPSQNTISLKM------IP---RIDYDRI 243
Cdd:PRK08559    89 KPIVEGIKEGDIVELIAGPFKGEKARVVRVDESKEEVTVELleaavpIPvtvRGDQVRV 147
PHA03247 PHA03247
large tegument protein UL36; Provisional
672-889 7.94e-09

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 59.95  E-value: 7.94e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  672 VDRQRLTTVGSRRPGGMTTTYGRTPMYGSQTPMYGSGSRTPMYGSQTPlqdGSRTPHYGSQTPLHDGSRTPAQSGAWDPN 751
Cdd:PHA03247  2661 VSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTP---EPAPHALVSATPLPPGPAAARQASPALPA 2737
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  752 NPNTPsraeeeyeyafddePTPSPQAYGGTPN----PQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPqg 827
Cdd:PHA03247  2738 APAPP--------------AVPAGPATPGGPArparPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSP-- 2801
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1904527050  828 syqPSPSPQSYHQVAPSPAgYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPMTP-GAPSPGG 889
Cdd:PHA03247  2802 ---WDPADPPAAVLAPAAA-LPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLgGSVAPGG 2860
KOW_elon_Spt5 TIGR00405
transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial ...
113-235 9.59e-08

transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial NusG and the uL24 (previously L24p/L26e) family of ribosomal proteins. The most recent papers and crystal structures make this a transcription elongation factor rather than a ribosomal protein.


Pssm-ID: 129499 [Multi-domain]  Cd Length: 145  Bit Score: 52.20  E-value: 9.59e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  113 IGEERATAislmrKFIAYQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRlgywnqQMVP----IKEMT 188
Cdd:TIGR00405    7 VGQEKNVA-----RLMARKARKSGLEVYSILAPESLKGYILVEAETKIDMRNPIIGVPHVR------GVVEgeidFEEIE 75
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*..
gi 1904527050  189 DVLKVVKEVANLKPKSWVRLKRGIYKDDIAQVDYVEPSQNTISLKMI 235
Cdd:TIGR00405   76 RFLTPKKIIESIKKGDIVEIISGPFKGERAKVIRVDESKEEVTLELI 122
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
771-897 9.64e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 56.31  E-value: 9.64e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  771 PTPSPQAYGGTPNPQTPGYPDPSSPQVN-PQYNPQTPGTP--AMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAG 847
Cdd:pfam03154  188 PPGTTQAATAGPTPSAPSVPPQGSPATSqPPNQTQSTAAPhtLIQQTPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLP 267
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1904527050  848 YQNTHSPASYHP---------TPSPMAYQASPSPSPVGYSPMTPG----APSPGGYNPHTPGS 897
Cdd:pfam03154  268 QPSLHGQMPPMPhslqtgpshMQHPVPPQPFPLTPQSSQSQVPPGpspaAPGQSQQRIHTPPS 330
NGN smart00738
In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold; In Spt5p, ...
113-193 1.09e-07

In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold; In Spt5p, this domain may confer affinity for Spt4p.


Pssm-ID: 197850 [Multi-domain]  Cd Length: 106  Bit Score: 50.83  E-value: 1.09e-07
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050   113 IGEERATAISLMRKFIAYQFTDtplQIKSVVAP-EHVK----------------GYIYVEAYKQTHVKQAIEGV----GN 171
Cdd:smart00738    9 SGQEKRVAENLERKAEALGLED---KIVSILVPtEEVKeirrgkkkvverklfpGYIFVEADLEDEVWTAIRGTpgvrGF 85
                            90       100
                    ....*....|....*....|..
gi 1904527050   172 LRLGYWnQQMVPIKEMTDVLKV 193
Cdd:smart00738   86 VGGGGK-PTPVPDDEIEKILKP 106
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
716-904 3.18e-07

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 54.30  E-value: 3.18e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  716 SQTPL---QDGSRTPHYGSQTPLHDGSRTPaQSGAWDPNNPNTPSRAEEEYEYAFDD-EPTPSPQAYGGTPNP----QTP 787
Cdd:COG5180    195 SPEKLdrpKVEVKDEAQEEPPDLTGGADHP-RPEAASSPKVDPPSTSEARSRPATVDaQPEMRPPADAKERRRaaigDTP 273
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  788 GYPDPSSPQVNPQYNPQT--PGTPAMYNTDQFSPYAAPSPQGSYQPSPS-----PQSYHQVAPSPAGYQNTHSPASYHPT 860
Cdd:COG5180    274 AAEPPGLPVLEAGSEPQSdaPEAETARPIDVKGVASAPPATRPVRPPGGardpgTPRPGQPTERPAGVPEAASDAGQPPS 353
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*...
gi 1904527050  861 PSPMAYQASPSpspvgySPMTPGAPSPG--GYN--PHTPGSGIEQNSS 904
Cdd:COG5180    354 AYPPAEEAVPG------KPLEQGAPRPGssGGDgaPFQPPNGAPQPGL 395
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
398-425 1.99e-05

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 42.32  E-value: 1.99e-05
                            10        20
                    ....*....|....*....|....*...
gi 1904527050   398 YFKMGDHVKVIAGRFEGDTGLIVRVEEN 425
Cdd:smart00739    1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
633-664 3.23e-05

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 41.60  E-value: 3.23e-05
                           10        20        30
                   ....*....|....*....|....*....|..
gi 1904527050  633 IGQTVRISQGPYKGYIGVVKDATESTARVELH 664
Cdd:pfam00467    1 KGDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE 32
SP7_N cd22542
N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins ...
729-897 4.99e-05

N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP7, also called Osterix (Osx) in humans, is highly conserved among bone-forming vertebrates. It plays a major role, along with Runx2 and Dlx5, in driving the differentiation of mesenchymal precursor cells into osteoblasts and eventually osteocytes. SP7 also plays a regulatory role by inhibiting chondrocyte differentiation, maintaining the balance between differentiation of mesenchymal precursor cells into ossified bone or cartilage. Mutations of this gene have been associated with multiple dysfunctional bone phenotypes in vertebrates. SP7 is thought to play a role in diseases such as osteogenesis imperfecta. SP7 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. 
SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP7.
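
The consensus notation used in the description above (N for any nucleotide, R for A/G, Y for C/T, and X for any residue in the BTD box CXCPXC) can be turned into a simple pattern search. The following is a minimal sketch, assuming Python and its standard `re` module; the helper names `consensus_to_regex` and `find_sites` and both test sequences are hypothetical and are not part of the CDD record. It only illustrates literal, single-strand matching and says nothing about real binding affinity.

```python
import re

# IUPAC nucleotide codes needed for the consensus quoted in the text
# (N = any nucleotide, R = A/G, Y = C/T), plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular-expression source."""
    return "".join(IUPAC[base] for base in consensus.upper())

def find_sites(consensus, sequence):
    """Return (start, matched_text) for every site, including overlapping ones."""
    # A lookahead with a capture group reports overlapping matches.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

# Hypothetical DNA fragment, made up for illustration only.
dna = "TTAAGCGCCTTAA"
print(find_sites("NNRCRCCYY", dna))

# The BTD box CXCPXC is a protein motif; X (any residue) maps to "." in a regex.
# The peptide below is also made up for illustration.
btd_box = re.compile("C.CP.C")
peptide = "AACACPKCAA"
print([(m.start(), m.group(0)) for m in btd_box.finditer(peptide)])
```

This scans only the strand as given; a fuller search would also scan the reverse complement, which is omitted here to keep the sketch short.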


Pssm-ID: 411691 [Multi-domain]  Cd Length: 297  Bit Score: 46.43  E-value: 4.99e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  729 YGSQTPLHDgSRTPAQSGAWDPNNPNTP--------SRAEE----EYEYAFDD-----EPTPSPQA---YGGTPNPQTPG 788
Cdd:cd22542     26 FGGSSPIRD-SATPGKPGNNPGKKPYSLgsdlssakSRSSElmgdSYTATFSSgnglmSPSGSPQAsttYGNDYNPFSHS 104
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  789 YPDPSSPQ----VNPQYNPQTPGTPAMYNT-DQFSPY-----AAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYH 858
Cdd:cd22542    105 FPTSSGSQdpslLVSKGHPSADCLPSVYTSlDMAHPYgswykTGIHPGISSSSTNATASWWDMHSNTNWLSAQGQPDGLQ 184
                          170       180       190
                   ....*....|....*....|....*....|....*....
gi 1904527050  859 PTPSPMAYQASPSPSPVGYSPMTPgaPSPGGYNPHTPGS 897
Cdd:cd22542    185 ASLQPVPAQTPLNPQLPSYTEFTT--LNPAPYPAVGISS 221
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
751-887 9.36e-05

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 46.30  E-value: 9.36e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  751 NNPNTPSRAEEEYEYAFDD-EPTPSPQAYGGTPN-PQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQG- 827
Cdd:NF033839   249 DNVNTKVEIENTVHKIFADmDAVVTKFKKGLTQDtPKEPGNKKPSAPKPGMQPSPQPEKKEVKPEPETPKPEVKPQLEKp 328
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1904527050  828 SYQPSPSPQSYH-QVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSpvgySPMTPGAPSP 887
Cdd:NF033839   329 KPEVKPQPEKPKpEVKPQLETPKPEVKPQPEKPKPEVKPQPEKPKPE----VKPQPETPKP 385
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
402-430 1.82e-04

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 39.68  E-value: 1.82e-04
                           10        20        30
                   ....*....|....*....|....*....|.
gi 1904527050  402 GDHVKVIAGRFEGDTGLIVRVEE--NFIILF 430
Cdd:pfam00467    2 GDVVRVIAGPFKGKVGKVVEVDDkkNRVLVE 32
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
769-910 1.56e-03

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 42.45  E-value: 1.56e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  769 DEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQynPQTPGTPAMYNTDQFSPYAAPSPQ-GSYQPSPSPQSYH-QVAPSPA 846
Cdd:NF033839   370 EKPKPEVKPQPETPKPEVKPQPEKPKPEVKPQ--PEKPKPEVKPQPEKPKPEVKPQPEkPKPEVKPQPEKPKpEVKPQPE 447
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  847 GYQNTHSPASYHPTPSPMAYQASPSPSpVGYSPMTP----GAPSPGGYNPHTPG--SGIEQNSSDWVTTD 910
Cdd:NF033839   448 KPKPEVKPQPETPKPEVKPQPEKPKPE-VKPQPEKPkpdnSKPQADDKKPSTPNnlSKDKQPSNQASTNE 516
KLF1_2_4_N cd21972
N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel ...
756-895 2.85e-03

N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF1, KLF2, KLF4, and similar proteins.


Pssm-ID: 409230 [Multi-domain]  Cd Length: 194  Bit Score: 39.97  E-value: 2.85e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  756 PSRAEEEYEYAFDDEPTPSPQAYG----GTPNPQTPGYPDPSSPQVNPQYN-PQTPGTPAMYNTDQFSPYAAPSPQGSYQ 830
Cdd:cd21972     22 LDLEFILSNTVTSDNDNPPPPDPAypppESPESCSTVYDSDGCHPTPNAYCgPNGPGLPGHFLLAGNSPNLGPKIKTENQ 101
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1904527050  831 PS-------PSPQSYHQVAPS------PAGYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPmtPGAPSPGGYNPHTP 895
Cdd:cd21972    102 EQacmpvagYSGHYGPREPQRvppappPPQYAGHFQYHGHFNMFSPPLRANHPGMSTVMLTP--LSTPPLGFLSPEEA 177
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
346-373 8.32e-03

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 34.61  E-value: 8.32e-03
                            10        20
                    ....*....|....*....|....*...
gi 1904527050   346 NFQPGDNVEVCEGELINLQGKVLSVDGN 373
Cdd:smart00739    1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28
 
NGN_Euk cd09888
Eukaryotic N-Utilization Substance G (NusG) N-terminal (NGN) domain, including plant KTF1 (KOW ...
113-192 1.12e-37

Eukaryotic N-Utilization Substance G (NusG) N-terminal (NGN) domain, including plant KTF1 (KOW domain-containing Transcription Factor 1); The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides, Ouzounis, and Woese (KOW) repeats at its C-terminus. Spt5 forms an Spt4-Spt5 complex that is an essential RNA polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli, and has a variety of functions such as its involvement in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. Spt5-like is homologous to the Spt5 proteins present in all eukaryotes, and is unique in that it encodes a protein with an additional long carboxy-terminal extension that contains WG/GW motifs. Spt5-like, or KTF1 (KOW domain-containing Transcription Factor 1), is an RNA-directed DNA methylation (RdDM) pathway effector in plants.


Pssm-ID: 193577 [Multi-domain]  Cd Length: 86  Bit Score: 135.74  E-value: 1.12e-37
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  113 IGEERATAISLMRKFIAYQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLgyWNQQMVPIKEMTDVLK 192
Cdd:cd09888      9 PGKEREIVISLMRKFLDLQRTGNPLGIKSVFARDGLKGYIYIEARKEAHVKDAIEGLRGVYL--NTIKLVPIKEMPDVLS 86
CTD smart01104
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
699-816 1.15e-29

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 215026 [Multi-domain]  Cd Length: 121  Bit Score: 114.16  E-value: 1.15e-29
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050   699 GSQTPMYG-SGSRTPMYGSQTP----LQDGSRTPHYGSQTPLHDG--SRTPAQSGAWdPNNPNTPSRAEEEYEYAFDDEP 771
Cdd:smart01104    1 GGRTPAWGaSGSKTPAWGSRTPgtaaGGAPTARGGSGSRTPAWGGagSRTPAWGGAG-PTGSRTPAWGGASAWGNKSSEG 79
                            90       100       110       120
                    ....*....|....*....|....*....|....*....|....*..
gi 1904527050   772 TPSPQA--YGGTPNPQTPGYpdpssPQVNPQYNPQTPGTPAMYNTDQ 816
Cdd:smart01104   80 SASSWAagPGGAYGAPTPGY-----GGTPSAYGPATPGGGAMAGSAS 121
KOW_Spt5_3 cd06083
KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
398-448 2.58e-27

KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240507  Cd Length: 51  Bit Score: 104.91  E-value: 2.58e-27
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1904527050  398 YFKMGDHVKVIAGRFEGDTGLIVRVEENFIILFSDLTMHELKVLPRDLQLC 448
Cdd:cd06083      1 HFKVGDHVKVISGRHEGETGLVVKVEDDVVTVFSDLTMRELKVFPRDLQLS 51
KOW_Spt5_6 cd06086
KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
955-1011 7.03e-27

KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240510  Cd Length: 58  Bit Score: 103.75  E-value: 7.03e-27
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1904527050  955 EHLEPITPTKNNKVKVILGEDREATGVLLSIDGEDGIVRMDLDEQLKILNLRFLGKL 1011
Cdd:cd06086      1 EHLEPVPPEKGDRVKVIKGEDRGSTGELISIDGADGIVKMDSDGDIKILPMNFLAKL 57
KOW_Spt5_2 cd06082
KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
347-397 1.31e-26

KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240506  Cd Length: 51  Bit Score: 102.97  E-value: 1.31e-26
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1904527050  347 FQPGDNVEVCEGELINLQGKVLSVDGNKITIMPKHEDLKDMLEFPAQELRK 397
Cdd:cd06082      1 FQPGDNVEVIEGELKGLQGKVESVDGDIVTIMPKHEDLKEPLEFPAKELRK 51
KOW_Spt5_5 cd06085
KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
629-678 1.26e-25

KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240509  Cd Length: 52  Bit Score: 100.25  E-value: 1.26e-25
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|
gi 1904527050  629 DNELIGQTVRISQGPYKGYIGVVKDATESTARVELHSTCQTISVDRQRLT 678
Cdd:cd06085      2 RDPLIGKTVRIRKGPYKGYIGIVKDATGTTARVELHSKNKTITVDRSRLA 51
Spt5-NGN pfam03439
Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG ...
113-191 1.63e-25

Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG are shown to contain a novel 'NGN' domain. The combined NGN and KOW motif regions of Spt5 form the binding domain with Spt4. Spt5 complexes with Spt4 as a 1:1 heterodimer, and this Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The Schizosaccharomyces pombe core Spt5-Spt4 complex is a heterodimer bearing a trypsin-resistant Spt4-binding domain within the Spt5 subunit.


Pssm-ID: 397481  Cd Length: 84  Bit Score: 101.12  E-value: 1.63e-25
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527050  113 IGEERATAISLMRKFIAYQfTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLGywNQQMVPIKEMTDVL 191
Cdd:pfam03439    9 PGQEREVALSLMRKILALA-KTNNLGIYSVFAPDGLKGYIYVEADRQAAVKRALEGIPNVRGL--VPGLVPIKEMEHLL 84
KOW_Spt5_1 cd06081
KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
203-240 4.51e-18

KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240505  Cd Length: 38  Bit Score: 78.28  E-value: 4.51e-18
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 1904527050  203 KSWVRLKRGIYKDDIAQVDYVEPSQNTISLKMIPRIDY 240
Cdd:cd06081      1 GSWVRIKRGIYKGDLAQVDEVDENGNRVVVKLIPRIDY 38
KOW_Spt5_4 cd06084
KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
524-566 5.27e-18

KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240508  Cd Length: 43  Bit Score: 78.33  E-value: 5.27e-18
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|...
gi 1904527050  524 KDIVKVIDGPHSDREGEIRHLYHSFAFLHCKKLVENGGMFVCK 566
Cdd:cd06084      1 GDTVKVVDGPYKGRQGTVLHIYRGTLFLHSREVTENGGIFVVR 43
CTD pfam12815
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
699-757 1.27e-15

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 372327 [Multi-domain]  Cd Length: 71  Bit Score: 72.48  E-value: 1.27e-15
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  699 GSQTPMYGS--GSRTPMY---GSQTPL--QDGSRTPHY--GSQTPLHD--GSRTPAQSGAWDPnnPNTPS 757
Cdd:pfam12815    1 GSRTPAYNSagGSRTPAWgadGSRTPAygGAGGRTPAYnqGGKTPAWGgaGSRTPAYYGAWGG--SRTPA 68
nusG PRK08559
transcription antitermination protein NusG; Validated
114-243 1.08e-10

transcription antitermination protein NusG; Validated


Pssm-ID: 181467 [Multi-domain]  Cd Length: 153  Bit Score: 61.04  E-value: 1.08e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  114 GEERATAISLMRKFIAYQftdtpLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRlGYwNQQMVPIKEMTDVLKV 193
Cdd:PRK08559    16 GQERNVALMLAMRAKKEN-----LPIYAILAPPELKGYVLVEAESKGAVEEAIRGIPHVR-GV-VPGEISFEEVEHFLKP 88
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527050  194 VKEVANLKPKSWVRLKRGIYKDDIAQVDYVEPSQNTISLKM------IP---RIDYDRI 243
Cdd:PRK08559    89 KPIVEGIKEGDIVELIAGPFKGEKARVVRVDESKEEVTVELleaavpIPvtvRGDQVRV 147
CTD pfam12815
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
681-746 8.68e-10

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 372327 [Multi-domain]  Cd Length: 71  Bit Score: 55.91  E-value: 8.68e-10
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527050  681 GSRRPGGMTTTYGRTPMY-------------GSQTPMYGSGSRTPMYGsqtplQDGSRTPHYGSQTplhDGSRTPAQSG 746
Cdd:pfam12815    1 GSRTPAYNSAGGSRTPAWgadgsrtpayggaGGRTPAYNQGGKTPAWG-----GAGSRTPAYYGAW---GGSRTPAYGG 71
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
716-897 3.85e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 60.94  E-value: 3.85e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  716 SQTPLQDGSRTPHYGSQTPLHDgsrtPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPtPSPQAYGGTPNPQTPGYPdPSSP 795
Cdd:pfam03154  259 SQVSPQPLPQPSLHGQMPPMPH----SLQTGPSHMQHPVPPQPFPLTPQSSQSQVP-PGPSPAAPGQSQQRIHTP-PSQS 332
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  796 QVNPQYNP-QTPGTPAMYNTdqfsPYAAPSPQGSYQPSPSPQSY----HQVAPSPAGYQN---------------THSPA 855
Cdd:pfam03154  333 QLQSQQPPrEQPLPPAPLSM----PHIKPPPTTPIPQLPNPQSHkhppHLSGPSPFQMNSnlppppalkplsslsTHHPP 408
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*
gi 1904527050  856 SYHPTPSPMAYQASPSPSPVGYSPM---TPGAPSPGGYNPHTPGS 897
Cdd:pfam03154  409 SAHPPPLQLMPQSQQLPPPPAQPPVltqSQSLPPPAASHPPTSGL 453
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
654-973 3.98e-09

Family of unknown function (DUF5585); This is a family of unknown function found in chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 60.36  E-value: 3.98e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  654 ATESTARVELHSTCQTISvdrqrlTTVGSRRP--GGMTTTYGRTPMYGSQTPMYGSGsrTPMYGSQTPlQDGSRTPHYGS 731
Cdd:pfam17823  170 AASPAPRTAASSTTAASS------TTAASSAPttAASSAPATLTPARGISTAATATG--HPAAGTALA-AVGNSSPAAGT 240
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  732 QTPLhDGSRTPAQSGawdpnnpnTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGypDPSSPQVNPQYNPQTPGTPAM 811
Cdd:pfam17823  241 VTAA-VGTVTPAALA--------TLAAAAGTVASAAGTINMGDPHARRLSPAKHMPS--DTMARNPAAPMGAQAQGPIIQ 309
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  812 YNTDQfsPYAAPSPqgsyQPSPSPQSYHQVAPSPAGYQNTHSPASyhPTPSPMAYQASPSPSPVGYSPMTPGA------- 884
Cdd:pfam17823  310 VSTDQ--PVHNTAG----EPTPSPSNTTLEPNTPKSVASTNLAVV--TTTKAQAKEPSASPVPVLHTSMIPEVeatsptt 381
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  885 -PSPGGYNPHTPGSGIEQNSSdwvttdiQVKVRDTyLDTQVVGQT----GVIRSVTGGMCSVYLKDSEKVVSIssehlEP 959
Cdd:pfam17823  382 qPSPLLPTQGAAGPGILLAPE-------QVATEAT-AGTASAGPTprssGDPKTLAMASCQLSTQGQYLVVTT-----DP 448
                          330
                   ....*....|....*...
gi 1904527050  960 ITPTKNNK----VKVILG 973
Cdd:pfam17823  449 LTPALVDKmfllVVLILG 466
PHA03247 PHA03247
large tegument protein UL36; Provisional
672-889 7.94e-09

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 59.95  E-value: 7.94e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  672 VDRQRLTTVGSRRPGGMTTTYGRTPMYGSQTPMYGSGSRTPMYGSQTPlqdGSRTPHYGSQTPLHDGSRTPAQSGAWDPN 751
Cdd:PHA03247  2661 VSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTP---EPAPHALVSATPLPPGPAAARQASPALPA 2737
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  752 NPNTPsraeeeyeyafddePTPSPQAYGGTPN----PQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPqg 827
Cdd:PHA03247  2738 APAPP--------------AVPAGPATPGGPArparPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSP-- 2801
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1904527050  828 syqPSPSPQSYHQVAPSPAgYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPMTP-GAPSPGG 889
Cdd:PHA03247  2802 ---WDPADPPAAVLAPAAA-LPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLgGSVAPGG 2860
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
703-904 9.40e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 59.78  E-value: 9.40e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  703 PMYGSGSRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPsrAEEEYEYAFDDEPTPSPQayggTP 782
Cdd:pfam03154  294 PPQPFPLTPQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPP--APLSMPHIKPPPTTPIPQ----LP 367
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  783 NPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQgsyqPSP---SPQSyHQVAPSPAG------YQNTHS 853
Cdd:pfam03154  368 NPQSHKHPPHLSGPSPFQMNSNLPPPPALKPLSSLSTHHPPSAH----PPPlqlMPQS-QQLPPPPAQppvltqSQSLPP 442
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1904527050  854 PASYHPTPSpmAYQASPSPSPVGYSPMTPGAP----SPGGYNPHTP--GSGIEQNSS 904
Cdd:pfam03154  443 PAASHPPTS--GLHQVPSQSPFPQHPFVPGGPppitPPSGPPTSTSsaMPGIQPPSS 497
PHA03269 PHA03269
envelope glycoprotein C; Provisional
752-897 1.16e-08

envelope glycoprotein C; Provisional


Pssm-ID: 165527 [Multi-domain]  Cd Length: 566  Bit Score: 58.97  E-value: 1.16e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  752 NPNTPSRAEEEYEYAFDDEPTPSPQayggtPNPQTPGYPDPS-SPQVNPQYNPQtpgtpamyntdqfsPYAAPSPQGSYQ 830
Cdd:PHA03269    21 NLNTNIPIPELHTSAATQKPDPAPA-----PHQAASRAPDPAvAPTSAASRKPD--------------LAQAPTPAASEK 81
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1904527050  831 PSPSPQSYHQV--APSPAGYQNTHSPasyhPTPSPM-----AYQASPSPSPVGYSPMTPgAPSPGGYNPHTPGS 897
Cdd:PHA03269    82 FDPAPAPHQAAsrAPDPAVAPQLAAA----PKPDAAeaftsAAQAHEAPADAGTSAASK-KPDPAAHTQHSPPP 150
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
685-898 2.63e-08

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 58.10  E-value: 2.63e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  685 PGGMTTTYGRTPMYGSQTPMYGSGSRTPMYGSQ-------TPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNP---- 753
Cdd:pfam09606  231 PQQMGGAPNQVAMQQQQPQQQGQQSQLGMGINQmqqmpqgVGGGAGQGGPGQPMGPPGQQPGAMPNVMSIGDQNNYqqqq 310
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  754 --NTPSRAEEEYEYAFDDEPTPS--------PQAYGGTPNPQTPGypdpssPQVNPQYNPQTPGTPAMYNTDQFSPYAAP 823
Cdd:pfam09606  311 trQQQQQQGGNHPAAHQQQMNQSvgqggqvvALGGLNHLETWNPG------NFGGLGANPMQRGQPGMMSSPSPVPGQQV 384
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1904527050  824 SPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASyHPTPSPmAYQASPSPSPVGY--SPMTPGAPSPGGyNPHTPGSG 898
Cdd:pfam09606  385 RQVTPNQFMRQSPQPSVPSPQGPGSQPPQSHPG-GMIPSP-ALIPSPSPQMSQQpaQQRTIGQDSPGG-SLNTPGQS 458
NGN cd08000
N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily; The N-Utilization ...
109-191 3.51e-08

N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily; The N-Utilization Substance G (NusG) protein and its eukaryotic homolog Spt5 are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at its C-terminus in bacteria and archaea. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. Spt5 forms an Spt4-Spt5 complex that is an essential RNA Polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. The diverse activities suggest that, after diverging from a common ancestor, NusG proteins became specialized in different bacteria.


Pssm-ID: 193574 [Multi-domain]  Cd Length: 99  Bit Score: 51.94  E-value: 3.51e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  109 LFFQIGEERATAISLMRKFIA---------YQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLGYWN- 178
Cdd:cd08000      5 LFVKTGREEKVEKLLEKRFEAndieafvpkKEVPERKRGKIEEVIKPLFPGYVFVETDLSPELYELIREVPGVIGILGNg 84
                           90
                   ....*....|....*
gi 1904527050  179 --QQMVPIKEMTDVL 191
Cdd:cd08000     85 eePSPVSDEEIEMIL 99
KOW_elon_Spt5 TIGR00405
transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial ...
113-235 9.59e-08

transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial NusG and the uL24 (previously L24p/L26e) family of ribosomal proteins. The most recent papers and crystal structures make this a transcription elongation factor rather than a ribosomal protein.


Pssm-ID: 129499 [Multi-domain]  Cd Length: 145  Bit Score: 52.20  E-value: 9.59e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  113 IGEERATAislmrKFIAYQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRlgywnqQMVP----IKEMT 188
Cdd:TIGR00405    7 VGQEKNVA-----RLMARKARKSGLEVYSILAPESLKGYILVEAETKIDMRNPIIGVPHVR------GVVEgeidFEEIE 75
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*..
gi 1904527050  189 DVLKVVKEVANLKPKSWVRLKRGIYKDDIAQVDYVEPSQNTISLKMI 235
Cdd:TIGR00405   76 RFLTPKKIIESIKKGDIVEIISGPFKGERAKVIRVDESKEEVTLELI 122
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
771-897 9.64e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 56.31  E-value: 9.64e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  771 PTPSPQAYGGTPNPQTPGYPDPSSPQVN-PQYNPQTPGTP--AMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAG 847
Cdd:pfam03154  188 PPGTTQAATAGPTPSAPSVPPQGSPATSqPPNQTQSTAAPhtLIQQTPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLP 267
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1904527050  848 YQNTHSPASYHP---------TPSPMAYQASPSPSPVGYSPMTPG----APSPGGYNPHTPGS 897
Cdd:pfam03154  268 QPSLHGQMPPMPhslqtgpshMQHPVPPQPFPLTPQSSQSQVPPGpspaAPGQSQQRIHTPPS 330
NGN smart00738
In Spt5p, this domain may confer affinity for Spt4p. It possesses a RNP-like fold; In Spt5p, ...
113-193 1.09e-07

In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold.


Pssm-ID: 197850 [Multi-domain]  Cd Length: 106  Bit Score: 50.83  E-value: 1.09e-07
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050   113 IGEERATAISLMRKFIAYQFTDtplQIKSVVAP-EHVK----------------GYIYVEAYKQTHVKQAIEGV----GN 171
Cdd:smart00738    9 SGQEKRVAENLERKAEALGLED---KIVSILVPtEEVKeirrgkkkvverklfpGYIFVEADLEDEVWTAIRGTpgvrGF 85
                            90       100
                    ....*....|....*....|..
gi 1904527050   172 LRLGYWnQQMVPIKEMTDVLKV 193
Cdd:smart00738   86 VGGGGK-PTPVPDDEIEKILKP 106
PHA03378 PHA03378
EBNA-3B; Provisional
657-889 1.28e-07

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 55.84  E-value: 1.28e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  657 STARVELHSTCQTISvDRQRLTTVGSRRPGGMTTTygrtpmygSQTPMYGSGSRTPMYGSQTPLQD-----------GSR 725
Cdd:PHA03378   579 SPTTSQLASSAPSYA-QTPWPVPHPSQTPEPPTTQ--------SHIPETSAPRQWPMPLRPIPMRPlrmqpitfnvlVFP 649
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  726 TPHYGSQTPLHDGSRTPAQSGAWdPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPgyPDPSS-PQVNPQYNPQ 804
Cdd:PHA03378   650 TPHQPPQVEITPYKPTWTQIGHI-PYQPSPTGANTMLPIQWAPGTMQPPPRAPTPMRPPAAP--PGRAQrPAAATGRARP 726
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  805 TPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGyqnthSPASYHPTPSPMA-------YQASPSPSP--- 874
Cdd:PHA03378   727 PAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAA-----APGAPTPQPPPQAppapqqrPRGAPTPQPppq 801
                          250
                   ....*....|....*
gi 1904527050  875 VGYSPMTPGAPSPGG 889
Cdd:PHA03378   802 AGPTSMQLMPRAAPG 816
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
350-396 1.31e-07

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 48.75  E-value: 1.31e-07
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*....
gi 1904527050  350 GDNVEVCEGELINLQGKVLSVDG--NKITIMPKHEDLKDMLEFPAQELR 396
Cdd:cd00380      1 GDVVRVLRGPYKGREGVVVDIDPrfGIVTVKGATGSKGAELKVRFDDVD 49
PHA03247 PHA03247
large tegument protein UL36; Provisional
689-909 2.08e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 55.33  E-value: 2.08e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  689 TTTYGRTPMYGSQTPMygsgSRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSR-TPAQSGAWDPNNPNTPsRAEEEYEYAF 767
Cdd:PHA03247  2817 ALPPAASPAGPLPPPT----SAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRrPPSRSPAAKPAAPARP-PVRRLARPAV 2891
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  768 DDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQtpgtpamyntdqfsPYAAPSPQGSYQPSPSPQSYHQVAPSPAG 847
Cdd:PHA03247  2892 SRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQ--------------PQPPPPPPPRPQPPLAPTTDPAGAGEPSG 2957
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1904527050  848 YQNTHSPASYHPTPSPMAYQASPSPSPvgySPMTPGAPSPGgyNPHTPGSGIeqnsSDWVTT 909
Cdd:PHA03247  2958 AVPQPWLGALVPGRVAVPRFRVPQPAP---SREAPASSTPP--LTGHSLSRV----SSWASS 3010
PHA03377 PHA03377
EBNA-3C; Provisional
665-895 2.85e-07

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 54.67  E-value: 2.85e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  665 STCQTISVDRQRLTTVGSRRPGGMTTTygrTPMYGSQTPMYgSGSRTPMYGSQTPLQD---GSRTPHYGSQTPLHDGSRT 741
Cdd:PHA03377   686 SVFVLPSVDAGRAQPSEESHLSSMSPT---QPISHEEQPRY-EDPDDPLDLSLHPDQApppSHQAPYSGHEEPQAQQAPY 761
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  742 PaqsGAWDPNNPNTPSRAEEEyeyafddeptpsPQAYGGTPNpQTPGY--PDPSSPQvNPQY--------------NPQT 805
Cdd:PHA03377   762 P---GYWEPRPPQAPYLGYQE------------PQAQGVQVS-SYPGYagPWGLRAQ-HPRYrhswaywsqypghgHPQG 824
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  806 PGTP-AMYNTDQFSPYAAP-----SPQGSYQPSPSP----QSYHQVAPSPAGYQNTHSPASYHPTPS----PMAYQASPS 871
Cdd:PHA03377   825 PWAPrPPHLPPQWDGSAGHgqdqvSQFPHLQSETGPprlqLSQVPQLPYSQTLVSSSAPSWSSPQPRapirPIPTRFPPP 904
                          250       260
                   ....*....|....*....|....
gi 1904527050  872 PSPVGYSpMTPGAPSPGGYNPHTP 895
Cdd:PHA03377   905 PMPLQDS-MAVGCDSSGTACPSMP 927
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
716-904 3.18e-07

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 54.30  E-value: 3.18e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  716 SQTPL---QDGSRTPHYGSQTPLHDGSRTPaQSGAWDPNNPNTPSRAEEEYEYAFDD-EPTPSPQAYGGTPNP----QTP 787
Cdd:COG5180    195 SPEKLdrpKVEVKDEAQEEPPDLTGGADHP-RPEAASSPKVDPPSTSEARSRPATVDaQPEMRPPADAKERRRaaigDTP 273
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  788 GYPDPSSPQVNPQYNPQT--PGTPAMYNTDQFSPYAAPSPQGSYQPSPS-----PQSYHQVAPSPAGYQNTHSPASYHPT 860
Cdd:COG5180    274 AAEPPGLPVLEAGSEPQSdaPEAETARPIDVKGVASAPPATRPVRPPGGardpgTPRPGQPTERPAGVPEAASDAGQPPS 353
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*...
gi 1904527050  861 PSPMAYQASPSpspvgySPMTPGAPSPG--GYN--PHTPGSGIEQNSS 904
Cdd:COG5180    354 AYPPAEEAVPG------KPLEQGAPRPGssGGDgaPFQPPNGAPQPGL 395
PRK10263 PRK10263
DNA translocase FtsK; Provisional
770-896 7.14e-07

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 53.55  E-value: 7.14e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  770 EPTPSPQAYGGTPNPQtpgYPDPSSPQVNP---QYNPQTPGTPAMYNTDQFSPYAAPSP-QGSYQPSPSPQSYHQVAPSP 845
Cdd:PRK10263   370 EPVIAPAPEGYPQQSQ---YAQPAVQYNEPlqqPVQPQQPYYAPAAEQPAQQPYYAPAPeQPAQQPYYAPAPEQPVAGNA 446
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1904527050  846 AGYQNTHSPasYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPG 896
Cdd:PRK10263   447 WQAEEQQST--FAPQSTYQTEQTYQQPAAQEPLYQQPQPVEQQPVVEPEPV 495
PHA03247 PHA03247
large tegument protein UL36; Provisional
716-896 7.38e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.79  E-value: 7.38e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  716 SQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYP----- 790
Cdd:PHA03247  2567 SVPPPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHppptv 2646
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  791 --------DPSSPQVNPQYNPQTPGTPAMYN--TDQFSPYAAPSPQGSYQPS---PSPQSYHQVAPSPAGYQNTHSPASY 857
Cdd:PHA03247  2647 ppperprdDPAPGRVSRPRRARRLGRAAQASspPQRPRRRAARPTVGSLTSLadpPPPPPTPEPAPHALVSATPLPPGPA 2726
                          170       180       190
                   ....*....|....*....|....*....|....*....
gi 1904527050  858 HPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPG 896
Cdd:PHA03247  2727 AARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAG 2765
PHA03378 PHA03378
EBNA-3B; Provisional
726-895 1.11e-06

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 52.76  E-value: 1.11e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  726 TPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEE---YEYAFDDEPTPSPQAYGGTPNPQTPGYPDPS-SPQVNPQ- 800
Cdd:PHA03378   582 TSQLASSAPSYAQTPWPVPHPSQTPEPPTTQSHIPETsapRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHqPPQVEITp 661
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  801 ------------YNPQTPG---------TPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHP 859
Cdd:PHA03378   662 ykptwtqighipYQPSPTGantmlpiqwAPGTMQPPPRAPTPMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAP 741
                          170       180       190
                   ....*....|....*....|....*....|....*...
gi 1904527050  860 TPS--PMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTP 895
Cdd:PHA03378   742 GRArpPAAAPGRARPPAAAPGRARPPAAAPGAPTPQPP 779
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
634-677 1.58e-06

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 45.67  E-value: 1.58e-06
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*...
gi 1904527050  634 GQTVRISQGPYKGYIGVVKDATEST--ARVELH--STCQTISVDRQRL 677
Cdd:cd00380      1 GDVVRVLRGPYKGREGVVVDIDPRFgiVTVKGAtgSKGAELKVRFDDV 48
PRK10263 PRK10263
DNA translocase FtsK; Provisional
691-895 3.60e-06

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 51.24  E-value: 3.60e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  691 TYGRTPMYGSQTPMYGSGSRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAE-EEYEYAFDD 769
Cdd:PRK10263   379 GYPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQaEEQQSTFAP 458
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  770 EPTPSPQAYGGTPNPQTPGY--PDPSSPQVNPQYNPQT----PGTPAMYNTDQFS-------------------PYAAPS 824
Cdd:PRK10263   459 QSTYQTEQTYQQPAAQEPLYqqPQPVEQQPVVEPEPVVeetkPARPPLYYFEEVEekrarereqlaawyqpipePVKEPE 538
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1904527050  825 PQGSYQPSPSPQSYHQVAPSPAGyqnthSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPG---GYNPHTP 895
Cdd:PRK10263   539 PIKSSLKAPSVAAVPPVEAAAAV-----SPLASGVKKATLATGAAATVAAPVFSLANSGGPRPQvkeGIGPQLP 607
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
402-446 3.91e-06

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 44.90  E-value: 3.91e-06
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*....
gi 1904527050  402 GDHVKVIAGRFEGDTGLIVRVEENF----IILFSDLTMHELKVLPRDLQ 446
Cdd:cd00380      1 GDVVRVLRGPYKGREGVVVDIDPRFgivtVKGATGSKGAELKVRFDDVD 49
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
773-901 4.44e-06

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 50.92  E-value: 4.44e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  773 PSPQAYGGTPNPQTPGYPDPSS-PQVNPqynpqTPGTPAMynTDQFSPYAAPSPQGSyQPSPSPQSYHQVAPS--PAGYQ 849
Cdd:pfam03154  172 PVLQAQSGAASPPSPPPPGTTQaATAGP-----TPSAPSV--PPQGSPATSQPPNQT-QSTAAPHTLIQQTPTlhPQRLP 243
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|...
gi 1904527050  850 NTHSPASYHPTPSPMAY-QASPSPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQ 901
Cdd:pfam03154  244 SPHPPLQPMTQPPPPSQvSPQPLPQPSLHGQMPPMPHSLQTGPSHMQHPVPPQ 296
PHA03247 PHA03247
large tegument protein UL36; Provisional
738-893 4.60e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.09  E-value: 4.60e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  738 GSRTPAQSGAWDPNNPNTPSRAEEEYeyaFDDEPTPSP-----------------QAYGGTPNPQTPGYPDPSSPQV--N 798
Cdd:PHA03247  2494 AAPDPGGGGPPDPDAPPAPSRLAPAI---LPDEPVGEPvhprmltwirgleelasDDAGDPPPPLPPAAPPAAPDRSvpP 2570
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  799 PQYNPQTPGtPAMyNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQA-SPSPSPVGY 877
Cdd:PHA03247  2571 PRPAPRPSE-PAV-TSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPdPHPPPTVPP 2648
                          170
                   ....*....|....*.
gi 1904527050  878 SPMTPGAPSPGGYNPH 893
Cdd:PHA03247  2649 PERPRDDPAPGRVSRP 2664
PHA03378 PHA03378
EBNA-3B; Provisional
691-887 5.00e-06

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 50.84  E-value: 5.00e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  691 TYGRTPMYGSQTPMYGSGSRTPMYGSQTP-----------LQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRA 759
Cdd:PHA03378   578 TSPTTSQLASSAPSYAQTPWPVPHPSQTPeppttqshipeTSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPPQV 657
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  760 EeeyeyafddeptPSPQAYGGTPNPQTPGYPDPSSPQVN--PQYNPQTPGTPAMYNTDQFSPYAAPS----PQGSYQPSP 833
Cdd:PHA03378   658 E------------ITPYKPTWTQIGHIPYQPSPTGANTMlpIQWAPGTMQPPPRAPTPMRPPAAPPGraqrPAAATGRAR 725
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 1904527050  834 SPQSYHQVAPSPAGYQNTHSPASYHPTPS-PMAYQASPSPSPVGyspmTPGAPSP 887
Cdd:PHA03378   726 PPAAAPGRARPPAAAPGRARPPAAAPGRArPPAAAPGRARPPAA----APGAPTP 776
NGN_Arch cd09887
Archaeal N-Utilization Substance G (NusG) N-terminal (NGN) domain; The N-Utilization Substance ...
113-173 7.04e-06

Archaeal N-Utilization Substance G (NusG) N-terminal (NGN) domain; The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. Archaea have a eukaryotic-type transcription apparatus but bacterial-type transcription factors. NusG is one of the few archaeal transcription factors that has orthologs in both bacteria and eukaryotes. Archaeal NusG is similar to bacterial NusG, composed of an NGN domain and a Kyrpides Ouzounis and Woese (KOW) repeat. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Archaeal NusG forms a complex with DNA-directed RNA polymerase subunit E (rpoE) that is similar to the Spt5-Spt4 complex in eukaryotes.


Pssm-ID: 193576  Cd Length: 82  Bit Score: 44.84  E-value: 7.04e-06
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1904527050  113 IGEERATAISLMRKFiayqfTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLR 173
Cdd:cd09887      9 AGQERNVADLLAMRA-----EKENLDVYSILVPEELKGYVFVEAEDPDRVEELIRGIPHVR 64
PRK10263 PRK10263
DNA translocase FtsK; Provisional
792-903 1.02e-05

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 49.70  E-value: 1.02e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  792 PSSPQVNPQYNPQTpgtpamynTDQFSPYAAPSPQGSYQPSPSPQSYHQ-VAPSPAGYQNTHSPASYHPTPSPMAYQASP 870
Cdd:PRK10263   740 PHEPLFTPIVEPVQ--------QPQQPVAPQQQYQQPQQPVAPQPQYQQpQQPVAPQPQYQQPQQPVAPQPQYQQPQQPV 811
                           90       100       110
                   ....*....|....*....|....*....|...
gi 1904527050  871 SPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQNS 903
Cdd:PRK10263   812 APQPQYQQPQQPVAPQPQYQQPQQPVAPQPQDT 844
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
665-888 1.19e-05

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 49.53  E-value: 1.19e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  665 STCQTISVdrqrlTTVGSRRPGGmtTTYGRTPMYGSQTPM-YGSGSRTPMYGSQTPLQDgSRTPHYGSQTPlhdGSRTPA 743
Cdd:pfam05109  463 STGPTVST-----ADVTSPTPAG--TTSGASPVTPSPSPRdNGTESKAPDMTSPTSAVT-TPTPNATSPTP---AVTTPT 531
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  744 QSGAWDPNNPNTPSRAeeeYEYAFDDEPTPSPQAYGGTPNPQTPGY-------------PDPSSPQV---NPQYNPQ--- 804
Cdd:pfam05109  532 PNATSPTLGKTSPTSA---VTTPTPNATSPTPAVTTPTPNATIPTLgktsptsavttptPNATSPTVgetSPQANTTnht 608
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  805 ---TPGTPAMYNTDQFSPYAAPSPQGSYQPSPS------PQSYHQ-VAPSPAGYQNTHSP--ASYHPTPSPMAYQASPSP 872
Cdd:pfam05109  609 lggTSSTPVVTSPPKNATSAVTTGQHNITSSSTssmslrPSSISEtLSPSTSDNSTSHMPllTSAHPTGGENITQVTPAS 688
                          250
                   ....*....|....*.
gi 1904527050  873 SPVGYSPMTPGAPSPG 888
Cdd:pfam05109  689 TSTHHVSTSSPAPRPG 704
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
718-905 1.60e-05

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 49.01  E-value: 1.60e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  718 TPLQDGSRTPHYGSQ-----------TPLHDGSRTPAqSGAWDPNNPNTPSRAEE-EYEYAFDDEPTPSPQAYGGTPNPQ 785
Cdd:PHA03307    26 ATPGDAADDLLSGSQgqlvsdsaelaAVTVVAGAAAC-DRFEPPTGPPPGPGTEApANESRSTPTWSLSTLAPASPAREG 104
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  786 TPGYPDPSSPqvnpqynpqtPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNThSPASYHPTP---- 861
Cdd:PHA03307   105 SPTPPGPSSP----------DPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPA-AVASDAASSrqaa 173
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*.
gi 1904527050  862 --SPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQNSSD 905
Cdd:PHA03307   174 lpLSSPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASS 219
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
743-895 1.97e-05

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 46.18  E-value: 1.97e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  743 AQSGAWDPNNPNTPSR-AEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQfsPYA 821
Cdd:pfam15240   16 AQSSSEDVSQEDSPSLiSEEEGQSQQGGQGPQGPPPGGFPPQPPASDDPPGPPPPGGPQQPPPQGGKQKPQGPPP--QGG 93
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1904527050  822 APSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSPvGYSPMTPGAPSPGGyNPHTP 895
Cdd:pfam15240   94 PRPPPGKPQGPPPQGGNQQQGPPPPGKPQGPPPQGGGPPPQGGNQQGPPPPPP-GNPQGPPQRPPQPG-NPQGP 165
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
398-425 1.99e-05

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 42.32  E-value: 1.99e-05
                            10        20
                    ....*....|....*....|....*...
gi 1904527050   398 YFKMGDHVKVIAGRFEGDTGLIVRVEEN 425
Cdd:smart00739    1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28
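
The query/Cdd line pairs in these alignments can be compared column by column. The following is a minimal sketch, not part of the CD-Search output, that counts identical residues in one aligned pair. The function name alignment_identity is hypothetical, and the sketch assumes that "-" marks a gap and that lowercase letters mark residues excluded from the aligned region; both conventions are assumptions about this display format.

```python
from typing import Tuple


def alignment_identity(query: str, subject: str) -> Tuple[int, int]:
    """Return (identical, aligned) column counts for two aligned rows.

    Assumptions (not confirmed by the source page): '-' is a gap, and
    lowercase letters are residues that are not part of the aligned region.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")

    identical = 0
    aligned = 0
    for q, s in zip(query, subject):
        # Skip columns with a gap or an unaligned (lowercase) residue.
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# The KOW (smart00739) hit shown above, residues 398-425 of the query.
query_row = "YFKMGDHVKVIAGRFEGDTGLIVRVEEN"
cdd_row = "KFEVGDTVRVIAGPFKGKVGKVLEVDGE"

identical, aligned = alignment_identity(query_row, cdd_row)
print(f"{identical}/{aligned} identical ({100 * identical / aligned:.1f}%)")
```

For this hit the sketch reports 12 identical columns out of 28 aligned. Identity computed this way is only a rough descriptor; the bit score and E-value reported on the page come from the position-specific scoring matrix, not from this count.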
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
770-893 2.69e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.06  E-value: 2.69e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  770 EPTPSPQAYGGTPNPQTPGyPDPSSPQVNPQYNPQTPGTPAmyntdqfsPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQ 849
Cdd:PRK07764   391 AGAPAAAAPSAAAAAPAAA-PAPAAAAPAAAAAPAPAAAPQ--------PAPAPAPAPAPPSPAGNAPAGGAPSPPPAAA 461
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....
gi 1904527050  850 NTHSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPH 893
Cdd:PRK07764   462 PSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAPAG 505
PHA03247 PHA03247
large tegument protein UL36; Provisional
702-932 3.07e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.40  E-value: 3.07e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  702 TPMYGSGSRTPmyGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGT 781
Cdd:PHA03247  2741 PPAVPAGPATP--GGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAAL 2818
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  782 PNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFS-----PYAAPSPQGSYQPSPSPQSYHQVA--PSPAGYQNTHSP 854
Cdd:PHA03247  2819 PPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSvapggDVRRRPPSRSPAAKPAAPARPPVRrlARPAVSRSTESF 2898
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1904527050  855 ASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQNSSDWVTTDIQVKVRDTYLDTQVVGQTGVIR 932
Cdd:PHA03247  2899 ALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAVPR 2976
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
633-664 3.23e-05

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 41.60  E-value: 3.23e-05
                           10        20        30
                   ....*....|....*....|....*....|..
gi 1904527050  633 IGQTVRISQGPYKGYIGVVKDATESTARVELH 664
Cdd:pfam00467    1 KGDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE 32
PHA03291 PHA03291
envelope glycoprotein I; Provisional
766-913 4.48e-05

envelope glycoprotein I; Provisional


Pssm-ID: 223033 [Multi-domain]  Cd Length: 401  Bit Score: 46.87  E-value: 4.48e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  766 AFDDEPTPSPQAYGGTPnpqTPGYPDPSSPQVNPQYNPqtpgtpamynTDQFSPyAAPSPQGSYQPSPspqsyhQVAPSP 845
Cdd:PHA03291   165 AFPAEGTLAAPPLGEGS---ADGSCDPALPLSAPRLGP----------ADVFVP-ATPRPTPRTTASP------ETTPTP 224
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1904527050  846 AgyqNTHSPASyHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQNSSDWVTTDIQV 913
Cdd:PHA03291   225 S---TTTSPPS-TTIPAPSTTIAAPQAGTTPEAEGTPAPPTPGGGEAPPANATPAPEASRYELTVTQI 288
SP7_N cd22542
N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins ...
729-897 4.99e-05

N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP7, also called Osterix (Osx) in humans, is highly conserved among bone-forming vertebrates. It plays a major role, along with Runx2 and Dlx5, in driving the differentiation of mesenchymal precursor cells into osteoblasts and eventually osteocytes. SP7 also plays a regulatory role by inhibiting chondrocyte differentiation, maintaining the balance between differentiation of mesenchymal precursor cells into ossified bone or cartilage. Mutations of this gene have been associated with multiple dysfunctional bone phenotypes in vertebrates. SP7 is thought to play a role in diseases such as osteogenesis imperfecta. SP7 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus.
SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP7.


Pssm-ID: 411691 [Multi-domain]  Cd Length: 297  Bit Score: 46.43  E-value: 4.99e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  729 YGSQTPLHDgSRTPAQSGAWDPNNPNTP--------SRAEE----EYEYAFDD-----EPTPSPQA---YGGTPNPQTPG 788
Cdd:cd22542     26 FGGSSPIRD-SATPGKPGNNPGKKPYSLgsdlssakSRSSElmgdSYTATFSSgnglmSPSGSPQAsttYGNDYNPFSHS 104
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  789 YPDPSSPQ----VNPQYNPQTPGTPAMYNT-DQFSPY-----AAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYH 858
Cdd:cd22542    105 FPTSSGSQdpslLVSKGHPSADCLPSVYTSlDMAHPYgswykTGIHPGISSSSTNATASWWDMHSNTNWLSAQGQPDGLQ 184
                          170       180       190
                   ....*....|....*....|....*....|....*....
gi 1904527050  859 PTPSPMAYQASPSPSPVGYSPMTPgaPSPGGYNPHTPGS 897
Cdd:cd22542    185 ASLQPVPAQTPLNPQLPSYTEFTT--LNPAPYPAVGISS 221
PRK14959 PRK14959
DNA polymerase III subunits gamma and tau; Provisional
771-885 6.93e-05

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 184923 [Multi-domain]  Cd Length: 624  Bit Score: 46.60  E-value: 6.93e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  771 PTPSPQAYGGTPNPQTPGyPDPSSPQVNPQYNPQTPGTPAmyntdqfsPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQN 850
Cdd:PRK14959   387 EGPASGGAATIPTPGTQG-PQGTAPAAGMTPSSAAPATPA--------PSAAPSPRVPWDDAPPAPPRSGIPPRPAPRMP 457
                           90       100       110
                   ....*....|....*....|....*....|....*.
gi 1904527050  851 THSPASYHPTPSPMAYQASPSPS-PVGYSPMTPGAP 885
Cdd:PRK14959   458 EASPVPGAPDSVASASDAPPTLGdPSDTAEHTPSGP 493
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
751-887 9.36e-05

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 46.30  E-value: 9.36e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  751 NNPNTPSRAEEEYEYAFDD-EPTPSPQAYGGTPN-PQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQG- 827
Cdd:NF033839   249 DNVNTKVEIENTVHKIFADmDAVVTKFKKGLTQDtPKEPGNKKPSAPKPGMQPSPQPEKKEVKPEPETPKPEVKPQLEKp 328
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1904527050  828 SYQPSPSPQSYH-QVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSpvgySPMTPGAPSP 887
Cdd:NF033839   329 KPEVKPQPEKPKpEVKPQLETPKPEVKPQPEKPKPEVKPQPEKPKPE----VKPQPETPKP 385
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
722-910 1.21e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 46.13  E-value: 1.21e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  722 DGSRTPHYGSQTPLHDGSR--TPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNP 799
Cdd:PRK07764   596 GGEGPPAPASSGPPEEAARpaAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGG 675
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  800 QYNPQTPGTPAMyntdqfSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSPVGYSP 879
Cdd:PRK07764   676 AAPAAPPPAPAP------AAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPP 749
                          170       180       190
                   ....*....|....*....|....*....|.
gi 1904527050  880 MTPGAPSPGGYNPHTPGSGIEQNSSDWVTTD 910
Cdd:PRK07764   750 DPAGAPAQPPPPPAPAPAAAPAAAPPPSPPS 780
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
524-561 1.32e-04

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 40.28  E-value: 1.32e-04
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 1904527050  524 KDIVKVIDGPHSDREGEIRHLYHSFAFLHCKKLVENGG 561
Cdd:cd00380      1 GDVVRVLRGPYKGREGVVVDIDPRFGIVTVKGATGSKG 38
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
772-887 1.51e-04

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 45.68  E-value: 1.51e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  772 TPSPQAYGGTPNPQTPGY--PD-----PSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPS 844
Cdd:pfam05109  422 SKAPESTTTSPTLNTTGFaaPNtttglPSSTHVPTNLTAPASTGPTVSTADVTSPTPAGTTSGASPVTPSPSPRDNGTES 501
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|...
gi 1904527050  845 PAgyQNTHSPASYHPTPSPMAyqasPSPSPVGYSPmTPGAPSP 887
Cdd:pfam05109  502 KA--PDMTSPTSAVTTPTPNA----TSPTPAVTTP-TPNATSP 537
dnaA PRK14086
chromosomal replication initiator protein DnaA;
712-876 1.64e-04

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 45.59  E-value: 1.64e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  712 PMY-GSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNnPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYP 790
Cdd:PRK14086   103 RRTsEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQLPTAR-PAYPAYQQRPEPGAWPRAADDYGWQQQRLGFPPRAPYA 181
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  791 DPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSY-------QPSPSPQSYHQV--APSPAGYQNTHSPASYHPTP 861
Cdd:PRK14086   182 SPASYAPEQERDREPYDAGRPEYDQRRRDYDHPRPDWDRprrdrtdRPEPPPGAGHVHrgGPGPPERDDAPVVPIRPSAP 261
                          170
                   ....*....|....*
gi 1904527050  862 SPMAYQASPSPSPVG 876
Cdd:PRK14086   262 GPLAAQPAPAPGPGE 276
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
742-897 1.72e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 45.75  E-value: 1.72e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  742 PAQSGAWDPNNPNTPSRAEEEYEYAfdDEPTPS---PQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFS 818
Cdd:PRK07764   592 PGAAGGEGPPAPASSGPPEEAARPA--APAAPAapaAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGW 669
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  819 PYAAPSPQGSyQPSPSPQSYHQVAPS--PAGYQNTHSPASYHP---------TPSPMAYQASPSPSPVGYSPMTPGAPSP 887
Cdd:PRK07764   670 PAKAGGAAPA-APPPAPAPAAPAAPAgaAPAQPAPAPAATPPAgqaddpaaqPPQAAQGASAPSPAADDPVPLPPEPDDP 748
                          170
                   ....*....|
gi 1904527050  888 GGYNPHTPGS 897
Cdd:PRK07764   749 PDPAGAPAQP 758
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
718-887 1.77e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 45.53  E-value: 1.77e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  718 TPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYE--YAFDDEPTPSPQAYGGTPNPQTPGYPDP--- 792
Cdd:pfam03154   75 SPLKSAKRQREKGASDTEEPERATAKKSKTQEISRPNSPSEGEGESSdgRSVNDEGSSDPKDIDQDNRSTSPSIPSPqdn 154
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  793 -----SS----------PQVNPQYNPQTPGTPAMYNTDQfSPYAAPSPQGsyqPSPSPQSYHQVAPSPAGYQNTHSPASY 857
Cdd:pfam03154  155 esdsdSSaqqqilqtqpPVLQAQSGAASPPSPPPPGTTQ-AATAGPTPSA---PSVPPQGSPATSQPPNQTQSTAAPHTL 230
                          170       180       190
                   ....*....|....*....|....*....|
gi 1904527050  858 HPTPSPMAYQASPSPSPvGYSPMTPGAPSP 887
Cdd:pfam03154  231 IQQTPTLHPQRLPSPHP-PLQPMTQPPPPS 259
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
402-430 1.82e-04

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 39.68  E-value: 1.82e-04
                           10        20        30
                   ....*....|....*....|....*....|.
gi 1904527050  402 GDHVKVIAGRFEGDTGLIVRVEE--NFIILF 430
Cdd:pfam00467    2 GDVVRVIAGPFKGKVGKVVEVDDkkNRVLVE 32
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
737-897 2.21e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 45.55  E-value: 2.21e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  737 DGSRTPAQSGAWDPNNPNTPSRAeeeyeyAFDDEPTPSPQAYGGTPNPQTPGYPDPSSP--QVNPQYNPQTPGTPAMYNT 814
Cdd:PHA03307   238 DSSSSESSGCGWGPENECPLPRP------APITLPTRIWEASGWNGPSSRPGPASSSSSprERSPSPSPSSPGSGPAPSS 311
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  815 DQFSPYAAPSPQGSyQPSPSPQSyhqVAPSPAGyqnTHSPASYHPTPSPmayqASPSPSPVGYSPMTPGAPSPGGYNPHT 894
Cdd:PHA03307   312 PRASSSSSSSRESS-SSSTSSSS---ESSRGAA---VSPGPSPSRSPSP----SRPPPPADPSSPRKRPRPSRAPSSPAA 380

                   ...
gi 1904527050  895 PGS 897
Cdd:PHA03307   381 SAG 383
PHA03247 PHA03247
large tegument protein UL36; Provisional
753-895 2.66e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 45.31  E-value: 2.66e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  753 PNTPS-RAEEEYEYAFDDEPTPSPQAyGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMY----NTDQFSPYAAPSPQG 827
Cdd:PHA03247  2475 PGAPVyRRPAEARFPFAAGAAPDPGG-GGPPDPDAPPAPSRLAPAILPDEPVGEPVHPRMLtwirGLEELASDDAGDPPP 2553
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  828 SYQPSPSPQSYHQVAPSPagyqnthSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAP--SPGGYNPHTP 895
Cdd:PHA03247  2554 PLPPAAPPAAPDRSVPPP-------RPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDrgDPRGPAPPSP 2616
Pneumo_att_G pfam05539
Pneumovirinae attachment membrane glycoprotein G;
685-861 2.74e-04

Pneumovirinae attachment membrane glycoprotein G;


Pssm-ID: 114270 [Multi-domain]  Cd Length: 408  Bit Score: 44.65  E-value: 2.74e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  685 PGGMTTTYGRTPMyGSQTPMYGSGSRTPMYGSQT----PLQDG-SRTPHYGSQTPLHDgSRTPAQSGAWDP--NNPNTPS 757
Cdd:pfam05539  201 TQGHQTATANQRL-SSTEPVGTQGTTTSSNPEPQteppPSQRGpSGSPQHPPSTTSQD-QSTTGDGQEHTQrrKTPPATS 278
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  758 RAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPyaaPSPQ------GSYQP 831
Cdd:pfam05539  279 NRRSPHSTATPPPTTKRQETGRPTPRPTATTQSGSSPPHSSPPGVQANPTTQNLVDCKELDP---PKPNsicygvGIYNE 355
                          170       180       190
                   ....*....|....*....|....*....|
gi 1904527050  832 SpSPQSYHQVAPSPAGYqNTHSPASYHPTP 861
Cdd:pfam05539  356 A-LPRGCDIVVPLCSTY-TIMCMDTYYSKP 383
PTZ00395 PTZ00395
Sec24-related protein; Provisional
715-893 3.19e-04

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 45.07  E-value: 3.19e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  715 GSQTPLQDGSRTPHYGSQTPL-HDGSRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTP--NP--QTPGY 789
Cdd:PTZ00395   345 GSPNAASAGAPFNGLGNQADGgHINQVHPDARGAWAGGPHSNASYNCAAYSNAAQSNAAQSNAGFSNAGysNPgnSNPGY 424
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  790 PDP---SSPQVNPQY------NPQTPGTPamYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPT 860
Cdd:PTZ00395   425 NNApnsNTPYNNPPNsntpysNPPNSNPP--YSNLPYSNTPYSNAPLSNAPPSSAKDHHSAYHAAYQHRAANQPAANLPT 502
                          170       180       190
                   ....*....|....*....|....*....|...
gi 1904527050  861 PSPMAyqASPSPSPVGYSPMTPGAPSPGGYNPH 893
Cdd:PTZ00395   503 ANQPA--ANNFHGAAGNSVGNPFASRPFGSAPY 533
DUF1373 pfam07117
Protein of unknown function (DUF1373); This family consists of several hypothetical proteins ...
753-867 3.20e-04

Protein of unknown function (DUF1373); This family consists of several hypothetical proteins which seem to be specific to Oryzias latipes (Japanese ricefish). Members of this family are typically around 200 residues in length. The function of this family is unknown.


Pssm-ID: 462093 [Multi-domain]  Cd Length: 212  Bit Score: 43.24  E-value: 3.20e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  753 PNTPSRAEEEYEY----AFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGS 828
Cdd:pfam07117   42 PPRPEEEEGQGGGggtfPFPGSPEPEPGGGGSGPMPMSASAPEPEPAKAKPQRPAPAQGHGHGGGGDSDSSGSGSGHQGS 121
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|..
gi 1904527050  829 YQP---SPSPQSYHQVAPSPAGYQNTHSPasyHPTPSPMAYQ 867
Cdd:pfam07117  122 GGAgagAGAPGHQHEQEQESSSSDDDDED---EFEFTPEEDE 160
dnaA PRK14086
chromosomal replication initiator protein DnaA;
770-896 3.67e-04

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 44.43  E-value: 3.67e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  770 EPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQsyhqvAPSPAGYQ 849
Cdd:PRK14086    94 EPAPPPPHARRTSEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQLPTARPAYPAYQQRPEPGAWPR-----AADDYGWQ 168
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*...
gi 1904527050  850 NT-HSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGyNPHTPG 896
Cdd:PRK14086   169 QQrLGFPPRAPYASPASYAPEQERDREPYDAGRPEYDQRRR-DYDHPR 215
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
743-887 4.09e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 44.37  E-value: 4.09e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  743 AQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGT---PNPQTPGYPDPSSPQVNPQYNPQTpgTPAMYNTDQFSP 819
Cdd:pfam03154  176 AQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATsqpPNQTQSTAAPHTLIQQTPTLHPQR--LPSPHPPLQPMT 253
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  820 YAAPSPQGSYQPSPSPQSY------------------HQVAPSPAGYQNTHSPASYHPTPSPMAyqASPSPSPVGYSPMT 881
Cdd:pfam03154  254 QPPPPSQVSPQPLPQPSLHgqmppmphslqtgpshmqHPVPPQPFPLTPQSSQSQVPPGPSPAA--PGQSQQRIHTPPSQ 331

                   ....*.
gi 1904527050  882 PGAPSP 887
Cdd:pfam03154  332 SQLQSQ 337
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
773-904 4.10e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 44.48  E-value: 4.10e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  773 PSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTH 852
Cdd:PRK12323   392 PAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAARPAAAGPRP 471
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 1904527050  853 SPASYHPTPSPMAYQASPSPSPVGYSP---MTPGAPSPGGYNPHTPGSGIEQNSS 904
Cdd:PRK12323   472 VAAAAAAAPARAAPAAAPAPADDDPPPweeLPPEFASPAPAQPDAAPAGWVAESI 526
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
723-895 4.13e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 44.39  E-value: 4.13e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  723 GSRTPHYGSQTPLHDGSRTPAQSGAWDPnnPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQvnPQYN 802
Cdd:PHA03307   773 ALLEPAEPQRGAGSSPPVRAEAAFRRPG--RLRRSGPAADAASRTASKRKSRSHTPDGGSESSGPARPPGAAAR--PPPA 848
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  803 PQTPGTPAMyntDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASyHPTPSPMAyqaspsPSPVGYSPMTP 882
Cdd:PHA03307   849 RSSESSKSK---PAAAGGRARGKNGRRRPRPPEPRARPGAAAPPKAAAAAPPAG-APAPRPRP------APRVKLGPMPP 918
                          170       180
                   ....*....|....*....|
gi 1904527050  883 GAPSP-GGY------NPHTP 895
Cdd:PHA03307   919 GGPDPrGGFrrvppgDLHTP 938
PRK10263 PRK10263
DNA translocase FtsK; Provisional
712-877 4.19e-04

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 44.69  E-value: 4.19e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  712 PMYGSQTPLQDGSR--TPHYGSQTPlhDGSRTPAQSGaWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGY 789
Cdd:PRK10263   345 PVASVDVPPAQPTVawQPVPGPQTG--EPVIAPAPEG-YPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPYY 421
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  790 -PDPSSPQVNPQYNPQtPGTPAMYNtdqfsPYAAPSPQGSYQPSPSPQSYHQ-VAPSPAGYQNTHSPASYHPT---PSPM 864
Cdd:PRK10263   422 aPAPEQPAQQPYYAPA-PEQPVAGN-----AWQAEEQQSTFAPQSTYQTEQTyQQPAAQEPLYQQPQPVEQQPvvePEPV 495
                          170
                   ....*....|...
gi 1904527050  865 AYQASPSPSPVGY 877
Cdd:PRK10263   496 VEETKPARPPLYY 508
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
782-968 4.23e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 44.03  E-value: 4.23e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  782 PNPQTPGYPDPSSPQVNPQYNPQTPGTPAMyntdqfspyAAPSPQGSYQPSPSPQSyhQVAPSPAGYQNTHSPASYHPTP 861
Cdd:PRK14950   364 PAPQPAKPTAAAPSPVRPTPAPSTRPKAAA---------AANIPPKEPVRETATPP--PVPPRPVAPPVPHTPESAPKLT 432
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  862 spmayqasPSPSPVGYSPMTPGAPSPGGYNP--HTPGSGIEQNSSDWVTTDIQVKVRDTYLdtQVVGQTGViRSVTggmc 939
Cdd:PRK14950   433 --------RAAIPVDEKPKYTPPAPPKEEEKalIADGDVLEQLEAIWKQILRDVPPRSPAV--QALLSSGV-RPVS---- 497
                          170       180       190
                   ....*....|....*....|....*....|
gi 1904527050  940 svyLKDSEKVVSISSE-HLEPITPTKNNKV 968
Cdd:PRK14950   498 ---VEKNTLTLSFKSKfHKDKIEEPENRKI 524
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
719-888 5.90e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 44.01  E-value: 5.90e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  719 PLQDGSRTPHYGSQT--PLHDGSRTPAQSGAWDPNNP-------NTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGY 789
Cdd:PHA03307   195 PSTPPAAASPRPPRRssPISASASSPAPAPGRSAADDagasssdSSSSESSGCGWGPENECPLPRPAPITLPTRIWEASG 274
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  790 PDPSSPQVNPQYNPQTPGtpamyntdqfSPYAAPSP---QGSYQPSPSPQSYHQVAPSPAGyqnTHSPASYHPTPSPMAY 866
Cdd:PHA03307   275 WNGPSSRPGPASSSSSPR----------ERSPSPSPsspGSGPAPSSPRASSSSSSSRESS---SSSTSSSSESSRGAAV 341
                          170       180
                   ....*....|....*....|..
gi 1904527050  867 QASPSPSPVGYSPMTPGAPSPG 888
Cdd:PHA03307   342 SPGPSPSRSPSPSRPPPPADPS 363
Med26_M pfam15694
Mediator complex subunit 26 middle domain; Med26_M is the middle domain of subunit 26 of ...
785-901 9.33e-04

Mediator complex subunit 26 middle domain; Med26_M is the middle domain of subunit 26 of Mediator. Med19 and Med26 act synergistically to mediate the interaction between REST (a Kruppel-type zinc finger transcription factor that binds to a 21-bp RE1 silencing element present in over 900 human genes) and Mediator.


Pssm-ID: 464807 [Multi-domain]  Cd Length: 255  Bit Score: 42.17  E-value: 9.33e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  785 QTPGYPDPSSPQvNPQYNPQTpgTPAMYNTDQFSPYAapsPQGSY-QPSPSPQSYHQVAPSPAGYQNTHSP--------A 855
Cdd:pfam15694   81 ETGGPPQPKSPR-CSSFSPRN--SRHETFARRSSTYA---PKGSVpSPSPRSQVLDAQVPSPLPLSQPSTPpvqakrleK 154
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1904527050  856 SYHPTP-SPMAYQASPS-----PSPVGYSPMTPGAPSPGGYNPHTPGSGIEQ 901
Cdd:pfam15694  155 PPQSSPeSSQHWLEQSDseshqRHQDGSATLLSQSVSPGCKTPLHPGENSLP 206
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
694-887 1.02e-03

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.24  E-value: 1.02e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  694 RTPMYGSQTPMYGSGSRTPMYGSQTPLQDGSRTPhygsQTPLHDGSRTPAQSGAWDPNnPNTPSRAEeeyeyafdDEPTP 773
Cdd:PHA03307    64 RFEPPTGPPPGPGTEAPANESRSTPTWSLSTLAP----ASPAREGSPTPPGPSSPDPP-PPTPPPAS--------PPPSP 130
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  774 SPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTdqfSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGyqnths 853
Cdd:PHA03307   131 APDLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAASSRQ---AALPLSSPEETARAPSSPPAEPPPSTPPAA------ 201
                          170       180       190
                   ....*....|....*....|....*....|....
gi 1904527050  854 PASYHPTPSPMAyqASPSPSPVGYSPMTPGAPSP 887
Cdd:PHA03307   202 ASPRPPRRSSPI--SASASSPAPAPGRSAADDAG 233
PHA03369 PHA03369
capsid maturational protease; Provisional
741-838 1.08e-03

capsid maturational protease; Provisional


Pssm-ID: 223061 [Multi-domain]  Cd Length: 663  Bit Score: 43.06  E-value: 1.08e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  741 TPAQSGAWDPNnPNTPSRAEEE-YEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVnpqynPQTPGTPAMYNTDQFSP 819
Cdd:PHA03369   353 LTAPSRVLAAA-AKVAVIAAPQtHTGPADRQRPQRPDGIPYSVPARSPMTAYPPVPQF-----CGDPGLVSPYNPQSPGT 426
                           90
                   ....*....|....*....
gi 1904527050  820 YAAPSPQGSYQPSPSPQSY 838
Cdd:PHA03369   427 SYGPEPVGPVPPQPTNPYV 445
dnaA PRK14086
chromosomal replication initiator protein DnaA;
750-898 1.18e-03

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 42.89  E-value: 1.18e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  750 PNNPNTPSRAEEeyeyafddeptPSPQAYGGTPNPQTPGYPDPSSPQVNPQYnPQTPGTPAMY--NTDQFSPYAAPSPQG 827
Cdd:PRK14086    96 APPPPHARRTSE-----------PELPRPGRRPYEGYGGPRADDRPPGLPRQ-DQLPTARPAYpaYQQRPEPGAWPRAAD 163
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  828 SYQPSPSPQSYhqvaPSPAGYQnthSPASYHPTPSPMAY----------QASPSPSPVGYSPMTPGA-------PSPGGY 890
Cdd:PRK14086   164 DYGWQQQRLGF----PPRAPYA---SPASYAPEQERDREpydagrpeydQRRRDYDHPRPDWDRPRRdrtdrpePPPGAG 236

                   ....*...
gi 1904527050  891 NPHTPGSG 898
Cdd:PRK14086   237 HVHRGGPG 244
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
779-898 1.19e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 43.05  E-value: 1.19e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  779 GGTPNPQTPGYPDPSSPQVnPQYNPQTPGTPAMyntdqfspyAAPSPQGSYQPSPSPQSYHQVAPSPAgyqnthSPASYH 858
Cdd:PRK07764   389 GGAGAPAAAAPSAAAAAPA-AAPAPAAAAPAAA---------AAPAPAAAPQPAPAPAPAPAPPSPAG------NAPAGG 452
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|
gi 1904527050  859 PTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPGSG 898
Cdd:PRK07764   453 APSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAA 492
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
769-910 1.56e-03

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 42.45  E-value: 1.56e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  769 DEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQynPQTPGTPAMYNTDQFSPYAAPSPQ-GSYQPSPSPQSYH-QVAPSPA 846
Cdd:NF033839   370 EKPKPEVKPQPETPKPEVKPQPEKPKPEVKPQ--PEKPKPEVKPQPEKPKPEVKPQPEkPKPEVKPQPEKPKpEVKPQPE 447
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  847 GYQNTHSPASYHPTPSPMAYQASPSPSpVGYSPMTP----GAPSPGGYNPHTPG--SGIEQNSSDWVTTD 910
Cdd:NF033839   448 KPKPEVKPQPETPKPEVKPQPEKPKPE-VKPQPEKPkpdnSKPQADDKKPSTPNnlSKDKQPSNQASTNE 516
PHA03369 PHA03369
capsid maturational protease; Provisional
781-918 1.60e-03

capsid maturational protease; Provisional


Pssm-ID: 223061 [Multi-domain]  Cd Length: 663  Bit Score: 42.29  E-value: 1.60e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  781 TPNPQTPGYPDPSSPQVNPqynPQTPGTPAM---YNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSpAGYQNTHSPASY 857
Cdd:PHA03369   353 LTAPSRVLAAAAKVAVIAA---PQTHTGPADrqrPQRPDGIPYSVPARSPMTAYPPVPQFCGDPGLV-SPYNPQSPGTSY 428
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1904527050  858 HPTPSPMAYQASPSPS--PVGYSPMT-PGAPSPGGYnpHTPGS-GIEQNSSDWVTTDIQVKVRDT 918
Cdd:PHA03369   429 GPEPVGPVPPQPTNPYvmPISMANMVyPGHPQEHGH--ERKRKrGGELKEELIETLKLVKKLKEE 491
PHA03325 PHA03325
nuclear-egress-membrane-like protein; Provisional
731-892 1.79e-03

nuclear-egress-membrane-like protein; Provisional


Pssm-ID: 223044  Cd Length: 418  Bit Score: 41.79  E-value: 1.79e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  731 SQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAyggtpnpqtpgYPDPSSPQVNPQYNPQTPGTPA 810
Cdd:PHA03325   266 SSLPTSAPKRRSRRAGAMRAAAGETADLADDDGSEHSDPEPLPASLP-----------PPPVRRPRVKHPEAGKEEPDGA 334
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  811 MYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGyqnthSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGY 890
Cdd:PHA03325   335 RNAEAKEPAQPATSTSSKGSSSAQNKDSGSTGPGSSL-----AAASSFLEDDDFGSPPLDLTTSLRHMPSPSVTSAPEPP 409

                   ..
gi 1904527050  891 NP 892
Cdd:PHA03325   410 SI 411
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
771-888 2.30e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 41.90  E-value: 2.30e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  771 PTPSPQAYGGTPNPQTPgypDPSSPQVNPQYNPQTPGTPAmyntdqfSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQN 850
Cdd:PRK07764   404 AAPAAAPAPAAAAPAAA---AAPAPAAAPQPAPAPAPAPA-------PPSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAA 473
                           90       100       110
                   ....*....|....*....|....*....|....*...
gi 1904527050  851 THSPASyhPTPSPMAYQASPSPSPVgysPMTPGAPSPG 888
Cdd:PRK07764   474 PEPTAA--PAPAPPAAPAPAAAPAA---PAAPAAPAGA 506
PHA03264 PHA03264
envelope glycoprotein D; Provisional
763-899 2.61e-03

envelope glycoprotein D; Provisional


Pssm-ID: 223029 [Multi-domain]  Cd Length: 416  Bit Score: 41.53  E-value: 2.61e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  763 YEYAFDDEPTPSPQayGGTPNPqtPGYPDPsspQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSyhqva 842
Cdd:PHA03264   258 FEESKGYEPPPAPS--GGSPAP--PGDDRP---EAKPEPGPVEDGAPGRETGGEGEGPEPAGRDGAAGGEPKPGP----- 325
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527050  843 pspagyqnthspasyhPTPSPMAYQAS--PSPSPVGYSPMTPGAPSPGGYNPHTPGSGI 899
Cdd:PHA03264   326 ----------------PRPAPDADRPEgwPSLEAITFPPPTPATPAVPRARPVIVGTGI 368
KLF1_2_4_N cd21972
N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel ...
756-895 2.85e-03

N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF1, KLF2, KLF4, and similar proteins.


Pssm-ID: 409230 [Multi-domain]  Cd Length: 194  Bit Score: 39.97  E-value: 2.85e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  756 PSRAEEEYEYAFDDEPTPSPQAYG----GTPNPQTPGYPDPSSPQVNPQYN-PQTPGTPAMYNTDQFSPYAAPSPQGSYQ 830
Cdd:cd21972     22 LDLEFILSNTVTSDNDNPPPPDPAypppESPESCSTVYDSDGCHPTPNAYCgPNGPGLPGHFLLAGNSPNLGPKIKTENQ 101
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1904527050  831 PS-------PSPQSYHQVAPS------PAGYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPmtPGAPSPGGYNPHTP 895
Cdd:cd21972    102 EQacmpvagYSGHYGPREPQRvppappPPQYAGHFQYHGHFNMFSPPLRANHPGMSTVMLTP--LSTPPLGFLSPEEA 177
PHA03269 PHA03269
envelope glycoprotein C; Provisional
808-909 2.95e-03

envelope glycoprotein C; Provisional


Pssm-ID: 165527 [Multi-domain]  Cd Length: 566  Bit Score: 41.64  E-value: 2.95e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  808 TPAMYN---TDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPMTPGA 884
Cdd:PHA03269    28 IPELHTsaaTQKPDPAPAPHQAASRAPDPAVAPTSAASRKPDLAQAPTPAASEKFDPAPAPHQAASRAPDPAVAPQLAAA 107
                           90       100
                   ....*....|....*....|....*
gi 1904527050  885 PSPggyNPHTPGSGIEQNSSDWVTT 909
Cdd:PHA03269   108 PKP---DAAEAFTSAAQAHEAPADA 129
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
708-846 4.17e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 41.12  E-value: 4.17e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  708 GSRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYEyafddeptPSPQAYGGTPNPQTP 787
Cdd:PRK07764   674 GGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASA--------PSPAADDPVPLPPEP 745
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527050  788 GYPDPSSPQVNPQYNPQTPGTPAmyntdqfspyAAPSPQGSYQPSPSPQSYHQVAPSPA 846
Cdd:PRK07764   746 DDPPDPAGAPAQPPPPPAPAPAA----------APAAAPPPSPPSEEEEMAEDDAPSMD 794
KLF3_N cd21577
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ...
818-893 4.48e-03

N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF) was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.


Pssm-ID: 410554 [Multi-domain]  Cd Length: 214  Bit Score: 39.64  E-value: 4.48e-03
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1904527050  818 SPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHsPASYHP--TPSPMAYQASPSPSPVGYSPMTpgAPSPGGYNPH 893
Cdd:cd21577     41 SSSSSSSSPSSRASPPSPYSKSSPPSPPQQRPLSP-PLSLPPpvAPPPLSPGSVPGGLPVISPVMV--QPVPVLYPPH 115
KREPA2 cd23959
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of ...
696-887 5.99e-03

Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of the parasitic protozoan's KREPA RNA editing catalytic complex (RECC). Kinetoplastid RNA editing (KRE) proteins occur as pairs or sets of related proteins in multiple complexes. The KREPA complex is composed of six components (KREPA1-6), which share a conserved C-terminal region containing an oligonucleotide-binding (OB)-fold-like domain. KREPAs are responsible for the site-specific insertion and deletion of U nucleotides in the kinetoplastid mitochondrial pre-messenger RNA. Apart from the conserved C-terminal OB-fold domain, KREPA1, KREPA2, and KREPA3 contain two conserved C2H2 zinc-finger domains. KREPA2 and kinetoplastid RNA editing ligase 1 (KREL1) are specific for ligation post-U-deletion and are paralogous to KREL2 and KREPA1, which are specific for ligation post-U-insertion. KREPA2 is critical for RECC stability and KREL1 integration into the complex.


Pssm-ID: 467780 [Multi-domain]  Cd Length: 424  Bit Score: 40.24  E-value: 5.99e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  696 PMYGSQTPMYGSGSRTPMYGSQTPLQDGSRTPHYGSQTPLhDGSRTPAQSGAWDPNNP----NTPSRAEEEYEYAFDDEP 771
Cdd:cd23959     56 PLYGAVSPEGENPFDGPGLVTASTVSDCYVGNANFYEVDM-SDAFAMAPDESLGPFRAarvpNPFSASSSTQRETHKTAQ 134
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527050  772 TPSPQAYGGTPnPQTPGYPDPSSPQVNPqynPQTPGTPAMYNTDQ-FSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQN 850
Cdd:cd23959    135 VAPPKAEPQTA-PVTPFGQLPMFGQHPP---PAKPLPAAAAAQQSsASPGEVASPFASGTVSASPFATATDTAPSSGAPD 210
                          170       180       190
                   ....*....|....*....|....*....|....*..
gi 1904527050  851 THSPASyhPTPSPMAyqASPSPSPVGYSPMTPGAPSP 887
Cdd:cd23959    211 GFPAEA--SAPSPFA--APASAASFPAAPVANGEAAT 243
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
346-373 8.32e-03

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 34.61  E-value: 8.32e-03
                            10        20
                    ....*....|....*....|....*...
gi 1904527050   346 NFQPGDNVEVCEGELINLQGKVLSVDGN 373
Cdd:smart00739    1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
965-1007 8.48e-03

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 35.27  E-value: 8.48e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*
gi 1904527050  965 NNKVKVILGEDREATGVLLSIDGEDGIVRMDLDEQLK--ILNLRF 1007
Cdd:cd00380      1 GDVVRVLRGPYKGREGVVVDIDPRFGIVTVKGATGSKgaELKVRF 45
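
Each hit above pairs a query row (`gi 1904527050`) with a model row (`Cdd:`), so the two can be compared column by column. The following is a minimal Python sketch, not part of the CD-Search output. The function name `percent_identity` is hypothetical, and two conventions are assumptions here: `-` marks a gap column that is skipped, and letter case is ignored when comparing residues. It is applied to the rows of the smart00739 (KOW) hit.

```python
def percent_identity(query_row: str, model_row: str) -> tuple[int, int, float]:
    """Return (identical, aligned, percent) for two equal-length alignment rows.

    Assumptions (not taken from the CD-Search page): '-' is a gap and the
    column is skipped; residues are compared case-insensitively.
    """
    if len(query_row) != len(model_row):
        raise ValueError("alignment rows must have the same length")
    identical = 0
    aligned = 0
    for q, m in zip(query_row, model_row):
        if q == "-" or m == "-":
            continue  # gap in either row: not an aligned column
        aligned += 1
        if q.upper() == m.upper():
            identical += 1
    percent = 100.0 * identical / aligned if aligned else 0.0
    return identical, aligned, percent


# Rows copied from the smart00739 (KOW) hit, query residues 346-373.
kow_query = "NFQPGDNVEVCEGELINLQGKVLSVDGN"
kow_model = "KFEVGDTVRVIAGPFKGKVGKVLEVDGE"

identical, aligned, percent = percent_identity(kow_query, kow_model)
print(f"{identical}/{aligned} identical columns ({percent:.1f}%)")
```

Percent identity is only a rough companion to the reported bit score and E-value, which come from the position-specific scoring model rather than from simple residue matching.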
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
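
The per-hit statistics lines on this page (`Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...`) can be read programmatically and checked against the E-value threshold listed in the search parameters. The following is a minimal Python sketch under stated assumptions: the regular expression is inferred only from the lines shown on this page, the names `parse_hit_stats` and `within_threshold` are hypothetical, and treating the threshold as inclusive is a guess rather than documented CD-Search behavior.

```python
import re

# Pattern inferred from the statistics lines on this page (an assumption,
# not an official CD-Search format specification).
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?P<multi>\s*\[Multi-domain\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<evalue>\S+)"
)


def parse_hit_stats(line: str):
    """Parse one statistics line into a dict, or return None if it does not match."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("multi") is not None,
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "evalue": float(match.group("evalue")),
    }


def within_threshold(hit: dict, threshold: float = 0.01) -> bool:
    """True if the hit's E-value is at or below the threshold (inclusive by assumption)."""
    return hit["evalue"] <= threshold


# Two statistics lines copied from the hits listed on this page.
lines = [
    "Pssm-ID: 128978  Cd Length: 28  Bit Score: 34.61  E-value: 8.32e-03",
    "Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 41.12  E-value: 4.17e-03",
]
hits = [parse_hit_stats(line) for line in lines]
for hit in hits:
    print(hit["pssm_id"], hit["evalue"], within_threshold(hit))
```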

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.