Conserved domains on  [gi|291241653|ref|XP_002740721|]

PREDICTED: transcription elongation factor SPT5 isoform X1 [Saccoglossus kowalevskii]


List of domain hits

Name Accession Description Interval E-value
NGN_Euk cd09888
194-281 1.34e-41

Eukaryotic N-Utilization Substance G (NusG) N-terminal (NGN) domain, including plant KTF1 (KOW domain-containing Transcription Factor 1); The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides, Ouzounis, and Woese (KOW) repeats at its C-terminus. Spt5 forms an Spt4-Spt5 complex that is an essential RNA polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination-enhancing activity in Escherichia coli, and has a variety of functions, such as its involvement in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements differ. Spt5-like is homologous to the Spt5 proteins present in all eukaryotes, but is unique in that it encodes a protein with an additional long carboxy-terminal extension that contains WG/GW motifs. Spt5-like, or KTF1 (KOW domain-containing Transcription Factor 1), is an RNA-directed DNA methylation (RdDM) pathway effector in plants.

Pssm-ID: 193577 [Multi-domain]  Cd Length: 86  Bit Score: 146.91  E-value: 1.34e-41
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  194 NLWMVKCKMGEEKATAVTLMRKFIAYQVQDEPLQIKSVIAVEGLKGYIYVESYKQTHVKHAIEGIGNLRlgLWTQQMVPI 273
Cdd:cd09888     1 KLWAVKCKPGKEREIVISLMRKFLDLQRTGNPLGIKSVFARDGLKGYIYIEARKEAHVKDAIEGLRGVY--LNTIKLVPI 78

                  ....*...
gi 291241653  274 KEMPDVLK 281
Cdd:cd09888    79 KEMPDVLS 86
KOW_Spt5_3 cd06083
489-539 1.06e-27

KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is an RNA-binding motif known so far to be shared among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins, and Mtr4. KOW_Spt5 domains play critical roles in recruiting multiple other eukaryotic transcription elongation and RNA biogenesis factors and are additionally involved in binding eukaryotic Spt5 proteins to RNA polymerases.

Pssm-ID: 240507  Cd Length: 51  Bit Score: 106.07  E-value: 1.06e-27
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|.
gi 291241653  489 HFRMGDHVKVIGGRYEGDTGLIVRVEDNMVILFSDLTMHELKVLPRDLQLC 539
Cdd:cd06083     1 HFKVGDHVKVISGRHEGETGLVVKVEDDVVTVFSDLTMRELKVFPRDLQLS 51
KOW_Spt5_6 cd06086
1033-1094 2.73e-27

KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is an RNA-binding motif known so far to be shared among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins, and Mtr4. KOW_Spt5 domains play critical roles in recruiting multiple other eukaryotic transcription elongation and RNA biogenesis factors and are additionally involved in binding eukaryotic Spt5 proteins to RNA polymerases.

Pssm-ID: 240510  Cd Length: 58  Bit Score: 105.29  E-value: 2.73e-27
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 291241653 1033 DHLEPVMPSKQDKVKVILGEDRESTGTLINIDGQDGIVKMDqavgSDVQLKILHLGHLGKFV 1094
Cdd:cd06086     1 EHLEPVPPEKGDRVKVIKGEDRGSTGELISIDGADGIVKMD----SDGDIKILPMNFLAKLV 58
KOW_Spt5_5 cd06085
714-764 1.16e-25

KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is an RNA-binding motif known so far to be shared among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins, and Mtr4. KOW_Spt5 domains play critical roles in recruiting multiple other eukaryotic transcription elongation and RNA biogenesis factors and are additionally involved in binding eukaryotic Spt5 proteins to RNA polymerases.

Pssm-ID: 240509  Cd Length: 52  Bit Score: 100.25  E-value: 1.16e-25
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|.
gi 291241653  714 DMELIGQTVRIREGPFKGYIGIVKDATESTARVELHTNCKTISVDKNRLNP 764
Cdd:cd06085     2 RDPLIGKTVRIRKGPYKGYIGIVKDATGTTARVELHSKNKTITVDRSRLAV 52
KOW_Spt5_2 cd06082
438-488 3.72e-24

KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is an RNA-binding motif known so far to be shared among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins, and Mtr4. KOW_Spt5 domains play critical roles in recruiting multiple other eukaryotic transcription elongation and RNA biogenesis factors and are additionally involved in binding eukaryotic Spt5 proteins to RNA polymerases.

Pssm-ID: 240506  Cd Length: 51  Bit Score: 96.03  E-value: 3.72e-24
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|.
gi 291241653  438 LAPGDSVEVIEGELVHLQGKIIGVDGDKITIMPKHEDLKEPLDFPARELRK 488
Cdd:cd06082     1 FQPGDNVEVIEGELKGLQGKVESVDGDIVTIMPKHEDLKEPLEFPAKELRK 51
KOW_Spt5_4 cd06084
615-657 2.65e-22

KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is an RNA-binding motif known so far to be shared among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins, and Mtr4. KOW_Spt5 domains play critical roles in recruiting multiple other eukaryotic transcription elongation and RNA biogenesis factors and are additionally involved in binding eukaryotic Spt5 proteins to RNA polymerases.

Pssm-ID: 240508  Cd Length: 43  Bit Score: 90.66  E-value: 2.65e-22
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|...
gi 291241653  615 KDIVKVIDGPHSGRQGEVKHIYRSFAFLHSRLMTENGGIFVCR 657
Cdd:cd06084     1 GDTVKVVDGPYKGRQGTVLHIYRGTLFLHSREVTENGGIFVVR 43
CTD smart01104
784-919 1.25e-17

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain (CTD) of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. The CTD carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.

Pssm-ID: 215026 [Multi-domain]  Cd Length: 121  Bit Score: 79.87  E-value: 1.25e-17
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653    784 PGQTPMY---GSRTPMYGSQTP----LHDGSRTPHYGSQTPLHDG--SRTPGQtGAWDPTNRNTPARtdefdyrfdeptp 854
Cdd:smart01104    1 GGRTPAWgasGSKTPAWGSRTPgtaaGGAPTARGGSGSRTPAWGGagSRTPAW-GGAGPTGSRTPAW------------- 66
                            90       100       110       120       130       140
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 291241653    855 SPAYGGTPNPATPGYSADTPPSNGPYTPATPGSSTMYSSsdhtYSPyqhsstPSPGGYQGTPSPA 919
Cdd:smart01104   67 GGASAWGNKSSEGSASSWAAGPGGAYGAPTPGYGGTPSA----YGP------ATPGGGAMAGSAS 121
KOW_Spt5_1 cd06081
292-329 1.60e-17

KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is an RNA-binding motif known so far to be shared among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins, and Mtr4. KOW_Spt5 domains play critical roles in recruiting multiple other eukaryotic transcription elongation and RNA biogenesis factors and are additionally involved in binding eukaryotic Spt5 proteins to RNA polymerases.

Pssm-ID: 240505  Cd Length: 38  Bit Score: 76.74  E-value: 1.60e-17
                          10        20        30
                  ....*....|....*....|....*....|....*...
gi 291241653  292 KSWVRLKRGVFKDDLAQVDYVEPAQNQVTLKLVPRVDY 329
Cdd:cd06081     1 GSWVRIKRGIYKGDLAQVDEVDENGNRVVVKLIPRIDY 38
Spt5_N pfam11942
95-188 8.79e-06

Spt5 transcription elongation factor, acidic N-terminal; This is the very acidic N-terminal region of the early transcription elongation factor Spt5. The Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The actual function of this N-terminal domain is not known, although it is dispensable for binding to Spt4.

Pssm-ID: 463406  Cd Length: 97  Bit Score: 45.34  E-value: 8.79e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653    95 FILDEADVDDDYDDAVEEEEVEEGFQDLIDRNATTGDVEGDSHGAR------RLHQMWREQKEDEIEEYYKRKYADTTSg 168
Cdd:pfam11942    1 FIDDEAEVDDDEEEEEDEDEDEDGADDFIEDDEEDEDEEDGRRDDRrhreldRRRELEEDEDAEEIAEYLKERYGRSSS- 79
                           90       100
                   ....*....|....*....|
gi 291241653   169 GRYDSnyEMSDDITQQGLLP 188
Cdd:pfam11942   80 DAYRG--DAEEGVPQRLLLP 97
dnaA super family cl42516
875-985 7.40e-05

chromosomal replication initiator protein DnaA;


The actual alignment was detected with superfamily member PRK14086:

Pssm-ID: 455861 [Multi-domain]  Cd Length: 617  Bit Score: 46.74  E-value: 7.40e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  875 PSNGPYTPATPGSStMYSSSDhtyspyqhSSTPSPGGYQGTPSPA--NYQPAPSPGGYQPTPSPAY---QQSPSPGGYQL 949
Cdd:PRK14086   90 PSAGEPAPPPPHAR-RTSEPE--------LPRPGRRPYEGYGGPRadDRPPGLPRQDQLPTARPAYpayQQRPEPGAWPR 160
                          90       100       110
                  ....*....|....*....|....*....|....*.
gi 291241653  950 TPSPGGYPMTPGAPSPGgfNPLTPGASLDSGSSEWQ 985
Cdd:PRK14086  161 AADDYGWQQQRLGFPPR--APYASPASYAPEQERDR 194
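Each hit above carries a summary line (Pssm-ID, Cd Length, Bit Score, E-value) followed by paired alignment rows for the query (gi 291241653) and the domain model (Cdd:...). The Python sketch below shows one way such lines could be read and a percent identity computed. The names HitSummary, parse_summary, parse_alignment_row and percent_identity are hypothetical helpers written for this illustration, not part of any NCBI tool. The sketch assumes the row layout seen in this listing, and it assumes that only columns where both residues are uppercase letters count as aligned (lowercase residues and '-' gaps are skipped).

```python
import re
from typing import NamedTuple, Tuple


class HitSummary(NamedTuple):
    pssm_id: int
    multi_domain: bool
    cd_length: int
    bit_score: float
    e_value: float


# Pattern for lines such as:
# "Pssm-ID: 240507  Cd Length: 51  Bit Score: 106.07  E-value: 1.06e-27"
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(\d+)\s*(\[Multi-domain\])?\s*"
    r"Cd Length:\s*(\d+)\s*"
    r"Bit Score:\s*([\d.]+)\s*"
    r"E-value:\s*([\d.eE+-]+)"
)


def parse_summary(line: str) -> HitSummary:
    """Parse one per-hit summary line into typed fields."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError("not a Pssm-ID summary line: %r" % line)
    return HitSummary(
        pssm_id=int(m.group(1)),
        multi_domain=m.group(2) is not None,
        cd_length=int(m.group(3)),
        bit_score=float(m.group(4)),
        e_value=float(m.group(5)),
    )


def parse_alignment_row(line: str) -> Tuple[int, str, int]:
    """Split an alignment row into (start, residues, end).

    The last three whitespace-separated fields are taken to be the
    start coordinate, the residue string, and the end coordinate.
    """
    fields = line.split()
    return int(fields[-3]), fields[-2], int(fields[-1])


def percent_identity(query: str, model: str) -> Tuple[int, int, float]:
    """Return (identical, aligned, percent) over uppercase-vs-uppercase columns."""
    identical = aligned = 0
    for q, s in zip(query, model):
        # Assumption: lowercase residues and '-' mark unaligned or gapped columns.
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    pct = 100.0 * identical / aligned if aligned else 0.0
    return identical, aligned, pct


if __name__ == "__main__":
    summary = parse_summary(
        "Pssm-ID: 240507  Cd Length: 51  Bit Score: 106.07  E-value: 1.06e-27"
    )
    q_start, q_seq, q_end = parse_alignment_row(
        "gi 291241653  489 HFRMGDHVKVIGGRYEGDTGLIVRVEDNMVILFSDLTMHELKVLPRDLQLC 539"
    )
    m_start, m_seq, m_end = parse_alignment_row(
        "Cdd:cd06083     1 HFKVGDHVKVISGRHEGETGLVVKVEDDVVTVFSDLTMRELKVFPRDLQLS 51"
    )
    # Fraction of the domain model covered by this alignment row.
    coverage = (m_end - m_start + 1) / summary.cd_length
    print(summary, percent_identity(q_seq, m_seq), coverage)
```

For hits whose alignment spans several row pairs, the residue strings of each sequence would need to be concatenated before calling percent_identity.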
 
Name Accession Description Interval E-value
Spt5-NGN pfam03439
194-280 2.94e-30

Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG are shown to contain a novel 'NGN' domain. The combined NGN and KOW motif regions of Spt5 form the binding domain with Spt4. Spt5 complexes with Spt4 as a 1:1 heterodimer, and this Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The Schizosaccharomyces pombe core Spt5-Spt4 complex is a heterodimer bearing a trypsin-resistant Spt4-binding domain within the Spt5 subunit.


Pssm-ID: 397481  Cd Length: 84  Bit Score: 114.60  E-value: 2.94e-30
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   194 NLWMVKCKMGEEKATAVTLMRKFIAYQvQDEPLQIKSVIAVEGLKGYIYVESYKQTHVKHAIEGIGNLRLGlwTQQMVPI 273
Cdd:pfam03439    1 KIWAVKCTPGQEREVALSLMRKILALA-KTNNLGIYSVFAPDGLKGYIYVEADRQAAVKRALEGIPNVRGL--VPGLVPI 77

                   ....*..
gi 291241653   274 KEMPDVL 280
Cdd:pfam03439   78 KEMEHLL 84
nusG PRK08559
191-332 3.84e-16

transcription antitermination protein NusG; Validated


Pssm-ID: 181467 [Multi-domain]  Cd Length: 153  Bit Score: 76.83  E-value: 3.84e-16
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  191 KDPNLWMVKCKMGEEKATAVTLMRKfiayqVQDEPLQIKSVIAVEGLKGYIYVESYKQTHVKHAIEGIGNLRlGLwTQQM 270
Cdd:PRK08559    4 EMSMIFAVKTTAGQERNVALMLAMR-----AKKENLPIYAILAPPELKGYVLVEAESKGAVEEAIRGIPHVR-GV-VPGE 76
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 291241653  271 VPIKEMPDVLKVVKEVVTLKPKSWVRLKRGVFKDDLAQVDYVEPAQNQVTLKL----VP-----RVDYSRP 332
Cdd:PRK08559   77 ISFEEVEHFLKPKPIVEGIKEGDIVELIAGPFKGEKARVVRVDESKEEVTVELleaaVPipvtvRGDQVRV 147
CTD pfam12815
785-840 1.35e-12

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain (CTD) of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. The CTD carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 372327 [Multi-domain]  Cd Length: 71  Bit Score: 64.00  E-value: 1.35e-12
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 291241653   785 GQTPMY----GSRTPMY---GSQTPLHD--GSRTPHY--GSQTPLHD--GSRTPGQTGAWDPTnrNTPA 840
Cdd:pfam12815    2 SRTPAYnsagGSRTPAWgadGSRTPAYGgaGGRTPAYnqGGKTPAWGgaGSRTPAYYGAWGGS--RTPA 68
NGN smart00738
194-282 2.19e-11

In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold.


Pssm-ID: 197850 [Multi-domain]  Cd Length: 106  Bit Score: 61.62  E-value: 2.19e-11
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653    194 NLWMVKCKMGEEKATAVTLMRKFIAYQVQDeplQIKSVIA-VEGLK----------------GYIYVESYKQTHVKHAIE 256
Cdd:smart00738    1 NWYAVRTTSGQEKRVAENLERKAEALGLED---KIVSILVpTEEVKeirrgkkkvverklfpGYIFVEADLEDEVWTAIR 77
                            90       100
                    ....*....|....*....|....*....
gi 291241653    257 GIGNLRLGLWT---QQMVPIKEMPDVLKV 282
Cdd:smart00738   78 GTPGVRGFVGGggkPTPVPDDEIEKILKP 106
KOW_elon_Spt5 TIGR00405
198-324 5.56e-10

transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial NusG and the uL24 (previously L24p/L26e) family of ribosomal proteins. The most recent papers and crystal structures make this a transcription elongation factor rather than a ribosomal protein.


Pssm-ID: 129499 [Multi-domain]  Cd Length: 145  Bit Score: 58.75  E-value: 5.56e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   198 VKCKMGEEKATAVTLMRKfiayqVQDEPLQIKSVIAVEGLKGYIYVESYKQTHVKHAIEGIGNLRlGLwTQQMVPIKEMP 277
Cdd:TIGR00405    3 VKTSVGQEKNVARLMARK-----ARKSGLEVYSILAPESLKGYILVEAETKIDMRNPIIGVPHVR-GV-VEGEIDFEEIE 75
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*..
gi 291241653   278 DVLKVVKEVVTLKPKSWVRLKRGVFKDDLAQVDYVEPAQNQVTLKLV 324
Cdd:TIGR00405   76 RFLTPKKIIESIKKGDIVEIISGPFKGERAKVIRVDESKEEVTLELI 122
PHA03247 PHA03247
742-964 5.85e-09

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 60.72  E-value: 5.85e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  742 STARVELHTNCKTISVDKNRLNPVNQPTSR------GSTTSYRHTPlHPGQTPmyGSRTPMYGSQTPLHDGSRTPHYGSQ 815
Cdd:PHA03247 2657 APGRVSRPRRARRLGRAAQASSPPQRPRRRaarptvGSLTSLADPP-PPPPTP--EPAPHALVSATPLPPGPAAARQASP 2733
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  816 TPLHDGSRTPGQTGAWDPTNRNTPARtdefdyrfdEPTPSPAYGGTP--NPATPGYSADTPPSNGPYTPATPGSSTMYSS 893
Cdd:PHA03247 2734 ALPAAPAPPAVPAGPATPGGPARPAR---------PPTTAGPPAPAPpaAPAAGPPRRLTRPAVASLSESRESLPSPWDP 2804
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 291241653  894 SDHTYSPYQHSSTPSPGGYQGTPSPANYQPAPSPGGYQPTPSPAYQQ---SPSPGG-YQLTPSPGGYPMTPGAPS 964
Cdd:PHA03247 2805 ADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPlggSVAPGGdVRRRPPSRSPAAKPAAPA 2879
SPT5 COG5164
768-980 9.15e-08

Transcription elongation factor SPT5 [Transcription];


Pssm-ID: 444063 [Multi-domain]  Cd Length: 495  Bit Score: 56.19  E-value: 9.15e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  768 PTSRGSTTSYRHT----PlhPGQTpmyGSRTPM--YGSQTPLHDGSRTPHYGSQT----PLHDGSRTPGQTGAWDPTNRN 837
Cdd:COG5164    21 AGSQGSTKPAQNQgstrP--AGNT---GGTRPAqnQGSTTPAGNTGGTRPAGNQGatgpAQNQGGTTPAQNQGGTRPAGN 95
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  838 TPARTDEFDYRFDEPtpsPAYGGTPNPATPGYSAdTPPSNGPYTPATPGSSTmysssdhtysPYQHSSTpSPGGYQGTPS 917
Cdd:COG5164    96 TGGTTPAGDGGATGP---PDDGGATGPPDDGGST-TPPSGGSTTPPGDGGST----------PPGPGST-GPGGSTTPPG 160
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 291241653  918 PANYQPAPSPGGYQPTPSPAYQQSPSPGGYQLTPSPGGypmtpGAPSPGGFNPLTPGASLDSG 980
Cdd:COG5164   161 DGGSTTPPGPGGSTTPPDDGGSTTPPNKGETGTDIPTG-----GTPRQGPDGPVKKDDKNGKG 218
KLF1_2_4_N cd21972
N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel ...
852-973 1.99e-05

N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF1, KLF2, KLF4, and similar proteins.


Pssm-ID: 409230 [Multi-domain]  Cd Length: 194  Bit Score: 46.52  E-value: 1.99e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  852 PTPSPAYG--GTPNPATPGYSADTPPSNGPYTPATPGSSTM------YSSSDHTYSPYQ-------HSSTPSPGGYqGTP 916
Cdd:cd21972    40 PPPDPAYPppESPESCSTVYDSDGCHPTPNAYCGPNGPGLPghfllaGNSPNLGPKIKTenqeqacMPVAGYSGHY-GPR 118
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|....*....
gi 291241653  917 SPANYQPAPSPGGYQPTPSP-AYQQSPSPGGYQLTPSPGGYPMTP-GAPSPGGFNPLTP 973
Cdd:cd21972   119 EPQRVPPAPPPPQYAGHFQYhGHFNMFSPPLRANHPGMSTVMLTPlSTPPLGFLSPEEA 177
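
The KLF1_2_4_N description above quotes the consensus motif NNRCRCCYY using IUPAC nucleotide codes. As a minimal sketch of how such a consensus can be scanned for in a DNA string, the Python below expands the codes into a regular expression. The function names and the toy sequence are hypothetical and are not part of the CDD record. Because the description calls the consensus loose, a literal match should not be expected at every real binding site, and the quoted GC and GT boxes are not tested here.

```python
import re

# IUPAC nucleotide codes, limited to those used by the quoted consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine: A or G
    "Y": "[CT]",    # pyrimidine: C or T
    "N": "[ACGT]",  # any nucleotide
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def find_sites(consensus: str, sequence: str) -> list[int]:
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead lets the scan report matches that overlap each other.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus).pattern + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


KLF_CONSENSUS = "NNRCRCCYY"       # motif as quoted in the description
toy_sequence = "TTAAGCGCCTTGG"    # hypothetical sequence, for illustration only

print(consensus_to_regex(KLF_CONSENSUS).pattern)
print(find_sites(KLF_CONSENSUS, toy_sequence))
```

This scans a single strand only; a fuller search would also scan the reverse complement of the sequence.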
dnaA PRK14086
chromosomal replication initiator protein DnaA;
875-985 7.40e-05

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 46.74  E-value: 7.40e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  875 PSNGPYTPATPGSStMYSSSDhtyspyqhSSTPSPGGYQGTPSPA--NYQPAPSPGGYQPTPSPAY---QQSPSPGGYQL 949
Cdd:PRK14086   90 PSAGEPAPPPPHAR-RTSEPE--------LPRPGRRPYEGYGGPRadDRPPGLPRQDQLPTARPAYpayQQRPEPGAWPR 160
                          90       100       110
                  ....*....|....*....|....*....|....*.
gi 291241653  950 TPSPGGYPMTPGAPSPGgfNPLTPGASLDSGSSEWQ 985
Cdd:PRK14086  161 AADDYGWQQQRLGFPPR--APYASPASYAPEQERDR 194
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
718-749 8.19e-05

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 40.45  E-value: 8.19e-05
                           10        20        30
                   ....*....|....*....|....*....|..
gi 291241653   718 IGQTVRIREGPFKGYIGIVKDATESTARVELH 749
Cdd:pfam00467    1 KGDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE 32
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
489-516 1.25e-04

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 40.00  E-value: 1.25e-04
                            10        20
                    ....*....|....*....|....*...
gi 291241653    489 HFRMGDHVKVIGGRYEGDTGLIVRVEDN 516
Cdd:smart00739    1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28
DUF4813 pfam16072
Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. ...
919-1028 1.32e-04

Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 345 and 672 amino acids in length.


Pssm-ID: 435117 [Multi-domain]  Cd Length: 288  Bit Score: 45.13  E-value: 1.32e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   919 ANYQPAPS---PGGYQPTPSPAYQQSPSPGGYQLTPS---PGG---YPMTPGAPSPGGFNPLTPGASLDSGSSEWQtihi 989
Cdd:pfam16072    2 ATYHPAGAtyhPGGYAPAGATYHPAGQVPAGATYYPSggvPHGatyYPQAPVAAVPAGATYLPAGAAIPAGATYYP---- 77
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|.
gi 291241653   990 evkvKATHEDSALIYKIGVIRGISGG--MCSVFIPDEGRVV 1028
Cdd:pfam16072   78 ----QAPKSSSGLGLGTGLIAGALGGaiLGHALTPTQTRVV 114
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
493-523 2.14e-04

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 39.29  E-value: 2.14e-04
                           10        20        30
                   ....*....|....*....|....*....|.
gi 291241653   493 GDHVKVIGGRYEGDTGLIVRVEDNMVILFSD 523
Cdd:pfam00467    2 GDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE 32
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
611-637 3.12e-04

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 38.85  E-value: 3.12e-04
                            10        20
                    ....*....|....*....|....*..
gi 291241653    611 NIQVKDIVKVIDGPHSGRQGEVKHIYR 637
Cdd:smart00739    1 KFEVGDTVRVIAGPFKGKVGKVLEVDG 27
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
614-643 3.68e-04

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 38.91  E-value: 3.68e-04
                           10        20        30
                   ....*....|....*....|....*....|
gi 291241653   614 VKDIVKVIDGPHSGRQGEVKHIYRSFAFLH 643
Cdd:pfam00467    1 KGDVVRVIAGPFKGKVGKVVEVDDKKNRVL 30
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
854-985 4.11e-04

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 42.08  E-value: 4.11e-04
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653    854 PSPAYGGTP------NPATPgYSADTPPSNG----PYTPATPGSSTMYSSSDHTYSPYQHSSTPSPGGYQGTPSPANYQP 923
Cdd:smart00818   22 PYPSYGYEPmggwlhHQIIP-VSQQHPPTHTlqphHHIPVLPAQQPVVPQQPLMPVPGQHSMTPTQHHQPNLPQPAQQPF 100
                            90       100       110       120       130       140
                    ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 291241653    924 APSPGGYQPTPSPAYQQSPSPGGYQLTPSPGGYPMTPGAPSPggfnPLTPGASLDSgsseWQ 985
Cdd:smart00818  101 QPQPLQPPQPQQPMQPQPPVHPIPPLPPQPPLPPMFPMQPLP----PLLPDLPLEA----WP 154
NusG COG0250
Transcription termination/antitermination protein NusG [Transcription];
718-748 3.39e-03

Transcription termination/antitermination protein NusG [Transcription];


Pssm-ID: 440020 [Multi-domain]  Cd Length: 171  Bit Score: 39.42  E-value: 3.39e-03
                          10        20        30
                  ....*....|....*....|....*....|.
gi 291241653  718 IGQTVRIREGPFKGYIGIVKDATESTARVEL 748
Cdd:COG0250   120 VGDRVRITDGPFAGFEGTVEEVDPEKGRVKV 150
 
Spt5-NGN pfam03439
Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG ...
194-280 2.94e-30

Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG are shown to contain a novel 'NGN' domain. The combined NGN and KOW motif regions of Spt5 form the binding domain with Spt4. Spt5 complexes with Spt4 as a 1:1 heterodimer and this Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The Schizosaccharomyces pombe core Spt5-Spt4 complex is a heterodimer bearing a trypsin-resistant Spt4-binding domain within the Spt5 subunit.


Pssm-ID: 397481  Cd Length: 84  Bit Score: 114.60  E-value: 2.94e-30
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   194 NLWMVKCKMGEEKATAVTLMRKFIAYQvQDEPLQIKSVIAVEGLKGYIYVESYKQTHVKHAIEGIGNLRLGlwTQQMVPI 273
Cdd:pfam03439    1 KIWAVKCTPGQEREVALSLMRKILALA-KTNNLGIYSVFAPDGLKGYIYVEADRQAAVKRALEGIPNVRGL--VPGLVPI 77

                   ....*..
gi 291241653   274 KEMPDVL 280
Cdd:pfam03439   78 KEMEHLL 84
KOW_Spt5_3 cd06083
KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
489-539 1.06e-27

KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240507  Cd Length: 51  Bit Score: 106.07  E-value: 1.06e-27
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|.
gi 291241653  489 HFRMGDHVKVIGGRYEGDTGLIVRVEDNMVILFSDLTMHELKVLPRDLQLC 539
Cdd:cd06083     1 HFKVGDHVKVISGRHEGETGLVVKVEDDVVTVFSDLTMRELKVFPRDLQLS 51
KOW_Spt5_6 cd06086
KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
1033-1094 2.73e-27

KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240510  Cd Length: 58  Bit Score: 105.29  E-value: 2.73e-27
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 291241653 1033 DHLEPVMPSKQDKVKVILGEDRESTGTLINIDGQDGIVKMDqavgSDVQLKILHLGHLGKFV 1094
Cdd:cd06086     1 EHLEPVPPEKGDRVKVIKGEDRGSTGELISIDGADGIVKMD----SDGDIKILPMNFLAKLV 58
KOW_Spt5_5 cd06085
KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
714-764 1.16e-25

KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240509  Cd Length: 52  Bit Score: 100.25  E-value: 1.16e-25
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|.
gi 291241653  714 DMELIGQTVRIREGPFKGYIGIVKDATESTARVELHTNCKTISVDKNRLNP 764
Cdd:cd06085     2 RDPLIGKTVRIRKGPYKGYIGIVKDATGTTARVELHSKNKTITVDRSRLAV 52
KOW_Spt5_2 cd06082
KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
438-488 3.72e-24

KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240506  Cd Length: 51  Bit Score: 96.03  E-value: 3.72e-24
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|.
gi 291241653  438 LAPGDSVEVIEGELVHLQGKIIGVDGDKITIMPKHEDLKEPLDFPARELRK 488
Cdd:cd06082     1 FQPGDNVEVIEGELKGLQGKVESVDGDIVTIMPKHEDLKEPLEFPAKELRK 51
KOW_Spt5_4 cd06084
KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
615-657 2.65e-22

KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240508  Cd Length: 43  Bit Score: 90.66  E-value: 2.65e-22
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|...
gi 291241653  615 KDIVKVIDGPHSGRQGEVKHIYRSFAFLHSRLMTENGGIFVCR 657
Cdd:cd06084     1 GDTVKVVDGPYKGRQGTVLHIYRGTLFLHSREVTENGGIFVVR 43
CTD smart01104
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
784-919 1.25e-17

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 215026 [Multi-domain]  Cd Length: 121  Bit Score: 79.87  E-value: 1.25e-17
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653    784 PGQTPMY---GSRTPMYGSQTP----LHDGSRTPHYGSQTPLHDG--SRTPGQtGAWDPTNRNTPARtdefdyrfdeptp 854
Cdd:smart01104    1 GGRTPAWgasGSKTPAWGSRTPgtaaGGAPTARGGSGSRTPAWGGagSRTPAW-GGAGPTGSRTPAW------------- 66
                            90       100       110       120       130       140
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 291241653    855 SPAYGGTPNPATPGYSADTPPSNGPYTPATPGSSTMYSSsdhtYSPyqhsstPSPGGYQGTPSPA 919
Cdd:smart01104   67 GGASAWGNKSSEGSASSWAAGPGGAYGAPTPGYGGTPSA----YGP------ATPGGGAMAGSAS 121
KOW_Spt5_1 cd06081
KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
292-329 1.60e-17

KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240505  Cd Length: 38  Bit Score: 76.74  E-value: 1.60e-17
                          10        20        30
                  ....*....|....*....|....*....|....*...
gi 291241653  292 KSWVRLKRGVFKDDLAQVDYVEPAQNQVTLKLVPRVDY 329
Cdd:cd06081     1 GSWVRIKRGIYKGDLAQVDEVDENGNRVVVKLIPRIDY 38
nusG PRK08559
transcription antitermination protein NusG; Validated
191-332 3.84e-16

transcription antitermination protein NusG; Validated


Pssm-ID: 181467 [Multi-domain]  Cd Length: 153  Bit Score: 76.83  E-value: 3.84e-16
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  191 KDPNLWMVKCKMGEEKATAVTLMRKfiayqVQDEPLQIKSVIAVEGLKGYIYVESYKQTHVKHAIEGIGNLRlGLwTQQM 270
Cdd:PRK08559    4 EMSMIFAVKTTAGQERNVALMLAMR-----AKKENLPIYAILAPPELKGYVLVEAESKGAVEEAIRGIPHVR-GV-VPGE 76
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 291241653  271 VPIKEMPDVLKVVKEVVTLKPKSWVRLKRGVFKDDLAQVDYVEPAQNQVTLKL----VP-----RVDYSRP 332
Cdd:PRK08559   77 ISFEEVEHFLKPKPIVEGIKEGDIVELIAGPFKGEKARVVRVDESKEEVTVELleaaVPipvtvRGDQVRV 147
CTD pfam12815
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
785-840 1.35e-12

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 372327 [Multi-domain]  Cd Length: 71  Bit Score: 64.00  E-value: 1.35e-12
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 291241653   785 GQTPMY----GSRTPMY---GSQTPLHD--GSRTPHY--GSQTPLHD--GSRTPGQTGAWDPTnrNTPA 840
Cdd:pfam12815    2 SRTPAYnsagGSRTPAWgadGSRTPAYGgaGGRTPAYnqGGKTPAWGgaGSRTPAYYGAWGGS--RTPA 68
NGN smart00738
In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold; In Spt5p, ...
194-282 2.19e-11

In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold; In Spt5p, this domain may confer affinity for Spt4p.


Pssm-ID: 197850 [Multi-domain]  Cd Length: 106  Bit Score: 61.62  E-value: 2.19e-11
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653    194 NLWMVKCKMGEEKATAVTLMRKFIAYQVQDeplQIKSVIA-VEGLK----------------GYIYVESYKQTHVKHAIE 256
Cdd:smart00738    1 NWYAVRTTSGQEKRVAENLERKAEALGLED---KIVSILVpTEEVKeirrgkkkvverklfpGYIFVEADLEDEVWTAIR 77
                            90       100
                    ....*....|....*....|....*....
gi 291241653    257 GIGNLRLGLWT---QQMVPIKEMPDVLKV 282
Cdd:smart00738   78 GTPGVRGFVGGggkPTPVPDDEIEKILKP 106
CTD smart01104
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
791-963 2.98e-11

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 215026 [Multi-domain]  Cd Length: 121  Bit Score: 61.77  E-value: 2.98e-11
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653    791 GSRTPMYGSqtplhDGSRTPHYGSQTPlhdgsrtpgqtgawdptnrntpartdefdyrfdeptpspAYGGTPNPATPGYS 870
Cdd:smart01104    1 GGRTPAWGA-----SGSKTPAWGSRTP---------------------------------------GTAAGGAPTARGGS 36
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653    871 ADTPPSNGPYTPATPGSSTMYSSSDHTySPYQHSSTPSPGGYQGTPSPANYQPAPSPGGyqptPSPAYQQSPSPGGYqlt 950
Cdd:smart01104   37 GSRTPAWGGAGSRTPAWGGAGPTGSRT-PAWGGASAWGNKSSEGSASSWAAGPGGAYGA----PTPGYGGTPSAYGP--- 108
                           170
                    ....*....|...
gi 291241653    951 PSPGGYPMTPGAP 963
Cdd:smart01104  109 ATPGGGAMAGSAS 121
KOW_elon_Spt5 TIGR00405
transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial ...
198-324 5.56e-10

transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial NusG and the uL24 (previously L24p/L26e) family of ribosomal proteins. The most recent papers and crystal structures make this a transcription elongation factor rather than a ribosomal protein.


Pssm-ID: 129499 [Multi-domain]  Cd Length: 145  Bit Score: 58.75  E-value: 5.56e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   198 VKCKMGEEKATAVTLMRKfiayqVQDEPLQIKSVIAVEGLKGYIYVESYKQTHVKHAIEGIGNLRlGLwTQQMVPIKEMP 277
Cdd:TIGR00405    3 VKTSVGQEKNVARLMARK-----ARKSGLEVYSILAPESLKGYILVEAETKIDMRNPIIGVPHVR-GV-VEGEIDFEEIE 75
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*..
gi 291241653   278 DVLKVVKEVVTLKPKSWVRLKRGVFKDDLAQVDYVEPAQNQVTLKLV 324
Cdd:TIGR00405   76 RFLTPKKIIESIKKGDIVEIISGPFKGERAKVIRVDESKEEVTLELI 122
CTD pfam12815
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
791-860 9.11e-10

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 372327 [Multi-domain]  Cd Length: 71  Bit Score: 55.91  E-value: 9.11e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   791 GSRTPMYGSqtplHDGSRTPHY---GSQTPLHD--GSRTP-----GQTGAWDPTNRNTPARTDEFDyrfDEPTpsPAYGG 860
Cdd:pfam12815    1 GSRTPAYNS----AGGSRTPAWgadGSRTPAYGgaGGRTPaynqgGKTPAWGGAGSRTPAYYGAWG---GSRT--PAYGG 71
PHA03247 PHA03247
large tegument protein UL36; Provisional
742-964 5.85e-09

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 60.72  E-value: 5.85e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  742 STARVELHTNCKTISVDKNRLNPVNQPTSR------GSTTSYRHTPlHPGQTPmyGSRTPMYGSQTPLHDGSRTPHYGSQ 815
Cdd:PHA03247 2657 APGRVSRPRRARRLGRAAQASSPPQRPRRRaarptvGSLTSLADPP-PPPPTP--EPAPHALVSATPLPPGPAAARQASP 2733
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  816 TPLHDGSRTPGQTGAWDPTNRNTPARtdefdyrfdEPTPSPAYGGTP--NPATPGYSADTPPSNGPYTPATPGSSTMYSS 893
Cdd:PHA03247 2734 ALPAAPAPPAVPAGPATPGGPARPAR---------PPTTAGPPAPAPpaAPAAGPPRRLTRPAVASLSESRESLPSPWDP 2804
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 291241653  894 SDHTYSPYQHSSTPSPGGYQGTPSPANYQPAPSPGGYQPTPSPAYQQ---SPSPGG-YQLTPSPGGYPMTPGAPS 964
Cdd:PHA03247 2805 ADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPlggSVAPGGdVRRRPPSRSPAAKPAAPA 2879
PHA03378 PHA03378
EBNA-3B; Provisional
725-965 4.95e-08

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 57.38  E-value: 4.95e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  725 REGP--FKGYIGIVKDATESTARVeLHTNCKTISVDKNRLNPVNQPTS---RGSTTSYRHTP---LHPGQTPmygsRTPM 796
Cdd:PHA03378  536 RRAPcvYTEDLDIESDEPASTEPV-HDQLLPAPGLGPLQIQPLTSPTTsqlASSAPSYAQTPwpvPHPSQTP----EPPT 610
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  797 YGSQTP-LHDGSRTPHYGSQTPLHDGSRTPGqtgAWDPTNRNTPARTDEFDYRFDEPTPSPAYGGTPNPATPGYSADTPP 875
Cdd:PHA03378  611 TQSHIPeTSAPRQWPMPLRPIPMRPLRMQPI---TFNVLVFPTPHQPPQVEITPYKPTWTQIGHIPYQPSPTGANTMLPI 687
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  876 SNGPYT----PATPGSSTMYSSSDHTYSPYQHSSTPSPGGyQGTPSPANyQPAPSPGGYQPTPSPAYQQSPSPGGYQLTP 951
Cdd:PHA03378  688 QWAPGTmqppPRAPTPMRPPAAPPGRAQRPAAATGRARPP-AAAPGRAR-PPAAAPGRARPPAAAPGRARPPAAAPGRAR 765
                         250
                  ....*....|....
gi 291241653  952 SPGGypmTPGAPSP 965
Cdd:PHA03378  766 PPAA---APGAPTP 776
NGN cd08000
N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily; The N-Utilization ...
194-280 2.11e-07

N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily; The N-Utilization Substance G (NusG) and its eukaryotic homolog Spt5 are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides, Ouzounis and Woese (KOW) repeats at its C-terminus in bacteria and archaea. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. Spt5 forms an Spt4-Spt5 complex that is an essential RNA Polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. The diverse activities suggest that, after diverging from a common ancestor, NusG proteins became specialized in different bacteria.


Pssm-ID: 193574 [Multi-domain]  Cd Length: 99  Bit Score: 50.01  E-value: 2.11e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  194 NLWMVKCKMGEEKATAVTLMRKFIA---------YQVQDEPLQIKSVIAVEGLKGYIYVESYKQTHVKHAIEGIGNLRLG 264
Cdd:cd08000     1 NWYVLFVKTGREEKVEKLLEKRFEAndieafvpkKEVPERKRGKIEEVIKPLFPGYVFVETDLSPELYELIREVPGVIGI 80
                          90
                  ....*....|....*....
gi 291241653  265 LW---TQQMVPIKEMPDVL 280
Cdd:cd08000    81 LGngeEPSPVSDEEIEMIL 99
PHA03378 PHA03378
EBNA-3B; Provisional
761-965 2.36e-07

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 55.07  E-value: 2.36e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  761 RLNPVNQP-TSRGSTTSYRHTPLH--PGQTPMYGSRTPMYGSQTPL----HDGSRTPHYGSQTPLHDGSRT--------- 824
Cdd:PHA03378  595 TPWPVPHPsQTPEPPTTQSHIPETsaPRQWPMPLRPIPMRPLRMQPitfnVLVFPTPHQPPQVEITPYKPTwtqighipy 674
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  825 -PGQTGA-------WDPTNRNTPARTdefdyrfdePTPSPAYGGTPNPATPGYSADTPPSNGPYTP-------ATPGSST 889
Cdd:PHA03378  675 qPSPTGAntmlpiqWAPGTMQPPPRA---------PTPMRPPAAPPGRAQRPAAATGRARPPAAAPgrarppaAAPGRAR 745
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  890 MYSSSDHTYSPYQHSSTPS--PGGYQGTPSPANYQPAPSPGGYQPTPSPAYQQSPS--PGGYQLTPspggyPMTPGAPSP 965
Cdd:PHA03378  746 PPAAAPGRARPPAAAPGRArpPAAAPGAPTPQPPPQAPPAPQQRPRGAPTPQPPPQagPTSMQLMP-----RAAPGQQGP 820
NGN_Arch cd09887
Archaeal N-Utilization Substance G (NusG) N-terminal (NGN) domain; The N-Utilization Substance ...
195-262 2.41e-07

Archaeal N-Utilization Substance G (NusG) N-terminal (NGN) domain; The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. Archaea have a eukaryotic-type transcription apparatus but bacterial-type transcription factors. NusG is one of the few archaeal transcription factors that has orthologs in both bacteria and eukaryotes. Archaeal NusG is similar to bacterial NusG, composed of an NGN domain and a Kyrpides, Ouzounis and Woese (KOW) repeat. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Archaeal NusG forms a complex with DNA-directed RNA polymerase subunit E (rpoE) that is similar to the Spt5-Spt4 complex in eukaryotes.


Pssm-ID: 193576  Cd Length: 82  Bit Score: 49.46  E-value: 2.41e-07
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 291241653  195 LWMVKCKMGEEKATAVTLMRKfiayqVQDEPLQIKSVIAVEGLKGYIYVESYKQTHVKHAIEGIGNLR 262
Cdd:cd09887     2 IYAVKTTAGQERNVADLLAMR-----AEKENLDVYSILVPEELKGYVFVEAEDPDRVEELIRGIPHVR 64
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
833-965 2.68e-07

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 54.92  E-value: 2.68e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   833 PTNRNTPARTDEFDYRFDEPTPSPAygGTPNPATPGYSADTPPSNGPYTPA---TPGSSTMYSSSDHTYSPYQHSSTPSP 909
Cdd:pfam05109  455 PTNLTAPASTGPTVSTADVTSPTPA--GTTSGASPVTPSPSPRDNGTESKApdmTSPTSAVTTPTPNATSPTPAVTTPTP 532
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 291241653   910 GGYQ------------GTPSPANyqPAPSPGGYQPTPS---PAYQQSpSPGGYQLTPSPGGYPMTPGAPSP 965
Cdd:pfam05109  533 NATSptlgktsptsavTTPTPNA--TSPTPAVTTPTPNatiPTLGKT-SPTSAVTTPTPNATSPTVGETSP 600
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
779-985 4.24e-07

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 54.39  E-value: 4.24e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   779 HTPlHPGQTPMYGSRTPMYGSQTPLHDGSRTPHYGSQTPLHDGSRTPGQTGawDPTnRNTPARTDEFDYRFDEPTPSPAY 858
Cdd:pfam03154  288 HMQ-HPVPPQPFPLTPQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSQLQSQ--QPP-REQPLPPAPLSMPHIKPPPTTPI 363
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   859 GGTPNPAT----PGYSADTP---PSNGPYTPATPGSSTMysSSDHTYSPYQHSSTPSPGGYQGTPSPAnyQPapsPGGYQ 931
Cdd:pfam03154  364 PQLPNPQShkhpPHLSGPSPfqmNSNLPPPPALKPLSSL--STHHPPSAHPPPLQLMPQSQQLPPPPA--QP---PVLTQ 436
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 291241653   932 -PTPSPAYQQSPSPGGYQLTPSPGGYPMTPGAP--SPGGFNPLTPGASLDSGSSEWQ 985
Cdd:pfam03154  437 sQSLPPPAASHPPTSGLHQVPSQSPFPQHPFVPggPPPITPPSGPPTSTSSAMPGIQ 493
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
851-987 6.12e-07

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 53.84  E-value: 6.12e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  851 EPTPSPAYGGTPNPATPGYSADTPPSNGPYTPATPGSSTMYSSSDHTYSPYQHSSTPSPGGYQGTPSPANYQPAPSPGGY 930
Cdd:PRK07764  602 APASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAAPAAP 681
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|....*...
gi 291241653  931 QPTPSPAYQQSPSPG-GYQLTPSPGGYPMTPGAPSPGGFNPLTPGASLDSGSSEWQTI 987
Cdd:PRK07764  682 PPAPAPAAPAAPAGAaPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPV 739
Chi1 COG3469
Chitinase [Carbohydrate transport and metabolism];
809-972 2.97e-06

Chitinase [Carbohydrate transport and metabolism];


Pssm-ID: 442692 [Multi-domain]  Cd Length: 534  Bit Score: 51.29  E-value: 2.97e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  809 TPHYGSQTPLHDGSRTPGQTGAWDPTNRNTPARTDEFDYRfdePTPSPAYGGTPNPATPGYSADTPPSNGPYTPATPGSS 888
Cdd:COG3469    57 AGSGTGTTAASSTAATSSTTSTTATATAAAAAATSTSATL---VATSTASGANTGTSTVTTTSTGAGSVTSTTSSTAGST 133
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  889 TMYSSSDHTYSPYqHSSTPSPGGYQGTPSPANYQPAPSPGGYQPTPSPAYQQSPSPGGYQLTPSPggyPMTPGAPSPGGF 968
Cdd:COG3469   134 TTSGASATSSAGS-TTTTTTVSGTETATGGTTTTSTTTTTTSASTTPSATTTATATTASGATTPS---ATTTATTTGPPT 209

                  ....
gi 291241653  969 NPLT 972
Cdd:COG3469   210 PGLP 213
PHA03247 PHA03247
large tegument protein UL36; Provisional
781-980 3.64e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.48  E-value: 3.64e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  781 PLHPGQTPMYGSRTPMYGSQTPLHDGSRTPHYGSQTPLHDGSRTPGQ-TGAWDPTNRNTP--AR------TDEFDYRFDE 851
Cdd:PHA03247 2626 PPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRaAQASSPPQRPRRraARptvgslTSLADPPPPP 2705
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  852 PTPSP---------------AYGGTPNPATPGYSADTPPSNGPYTPAT---PGSSTMYSSSDHTYSPYQHSSTPSPGGYQ 913
Cdd:PHA03247 2706 PTPEPaphalvsatplppgpAAARQASPALPAAPAPPAVPAGPATPGGparPARPPTTAGPPAPAPPAAPAAGPPRRLTR 2785
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 291241653  914 GTPSPANYQPAPSPGGYQPTPSPA----------YQQSPSPGgyqLTPSPGGYPMTPGAPSPGGFNPLTPGASLDSG 980
Cdd:PHA03247 2786 PAVASLSESRESLPSPWDPADPPAavlapaaalpPAASPAGP---LPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPG 2859
PHA03247 PHA03247
large tegument protein UL36; Provisional
823-975 4.07e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.48  E-value: 4.07e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  823 RTPGQTGAwDPTNRNTPARTDEFDYRFDEPTPSPAygGTPNPATPGYSADTPPSNGPYTPATPGSSTMYSSSDHTYSPYQ 902
Cdd:PHA03247 2599 RAPVDDRG-DPRGPAPPSPLPPDTHAPDPPPPSPS--PAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQ 2675
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  903 HSSTPSPGGYQGTPSP-----ANYQPAPSPGGYQPTPSPAYQQSPSPGGYQ-------------LTPSPGGYPMTPGAPS 964
Cdd:PHA03247 2676 ASSPPQRPRRRAARPTvgsltSLADPPPPPPTPEPAPHALVSATPLPPGPAaarqaspalpaapAPPAVPAGPATPGGPA 2755
                         170
                  ....*....|.
gi 291241653  965 PGGFNPLTPGA 975
Cdd:PHA03247 2756 RPARPPTTAGP 2766
AF-4 pfam05110
AF-4 proto-oncoprotein N-terminal region; This family consists of AF4 (Proto-oncogene AF4) and ...
754-942 5.53e-06

AF-4 proto-oncoprotein N-terminal region; This family consists of AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental retardation syndrome) nuclear proteins. These proteins have been linked to human diseases such as acute lymphoblastic leukaemia and mental retardation. The family also contains a Drosophila AF4 protein homolog Lilliputian which contains an AT-hook domain. Lilliputian represents a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila.


Pssm-ID: 461550 [Multi-domain]  Cd Length: 514  Bit Score: 50.51  E-value: 5.53e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   754 TISVDKNRLNPVNQPTSRGST---TSYRHTPLHP--GQTPMYGSRTPMYGSQTPLhDGSRTPHYGSQTPLHDGSRT---- 824
Cdd:pfam05110   82 QTPQEKPDQPFFPDKTSGLPPsfhTSSHSQPMGPpsSSSPSVSSSQSQKKSQART-EPAHGGHSSSGSQSSQRSQGqsrs 160
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   825 -PGQTGAWDPTNRNTPARTDEFDYRFDEPTPSPAyggtpNPATPGYSADTPPSNGPYTpATPGSSTMYSSSDHTYSPYQH 903
Cdd:pfam05110  161 kGGQESHSSSHHKRQERREDLFSCASLSHSLEEL-----SPLLSSLSSPVKPLSPSHS-RQHTGSKAQNSSDHHGKEYSH 234
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*...
gi 291241653   904 SSTP--SPGGYQGTPSPANYQPAPSPGGYQPT-PSP------AYQQSP 942
Cdd:pfam05110  235 SKSPrdSEAGSHGPESPSTSLLASSSQLSSQTfPPSlpsktsAMQQKP 282
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
765-973 6.87e-06

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 50.30  E-value: 6.87e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   765 VNQPTSRG-STTSYRHTPLHPGQTPMYGSRTPMYGSQTPlhdgsrTPHYGSQTPLHDgSRTPGQT----GAWDPTNRNTP 839
Cdd:pfam05109  513 VTTPTPNAtSPTPAVTTPTPNATSPTLGKTSPTSAVTTP------TPNATSPTPAVT-TPTPNATiptlGKTSPTSAVTT 585
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   840 ARTDEFDYRFDEPTPSP-----AYGGT---PNPATPGYSADTPPSNGPYTPATPGSSTMY------------SSSDHTYS 899
Cdd:pfam05109  586 PTPNATSPTVGETSPQAnttnhTLGGTsstPVVTSPPKNATSAVTTGQHNITSSSTSSMSlrpssisetlspSTSDNSTS 665
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   900 --PYQHSSTPSPGG--YQGTPSPANYQ------PAPSPGgyqptpspAYQQSPSPGGYQLTPSPGGYPMTPGAPSPGGFN 969
Cdd:pfam05109  666 hmPLLTSAHPTGGEniTQVTPASTSTHhvstssPAPRPG--------TTSQASGPGNSSTSTKPGEVNVTKGTPPKNATS 737

                   ....
gi 291241653   970 PLTP 973
Cdd:pfam05109  738 PQAP 741
Spt5_N pfam11942
Spt5 transcription elongation factor, acidic N-terminal; This is the very acidic N-terminal ...
95-188 8.79e-06

Spt5 transcription elongation factor, acidic N-terminal; This is the very acidic N-terminal region of the early transcription elongation factor Spt5. The Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The actual function of this N-terminal domain is not known although it is dispensable for binding to Spt4.


Pssm-ID: 463406  Cd Length: 97  Bit Score: 45.34  E-value: 8.79e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653    95 FILDEADVDDDYDDAVEEEEVEEGFQDLIDRNATTGDVEGDSHGAR------RLHQMWREQKEDEIEEYYKRKYADTTSg 168
Cdd:pfam11942    1 FIDDEAEVDDDEEEEEDEDEDEDGADDFIEDDEEDEDEEDGRRDDRrhreldRRRELEEDEDAEEIAEYLKERYGRSSS- 79
                           90       100
                   ....*....|....*....|
gi 291241653   169 GRYDSnyEMSDDITQQGLLP 188
Cdd:pfam11942   80 DAYRG--DAEEGVPQRLLLP 97
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
850-966 9.85e-06

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 46.95  E-value: 9.85e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   850 DEPTPSPAYG-GTPNPATPGYSADTPPSNGPYTPATPGSStmySSSDHTYSPYQHSSTPSPGGYQgtPSPANYQPAPSPG 928
Cdd:pfam15240   34 EEEGQSQQGGqGPQGPPPGGFPPQPPASDDPPGPPPPGGP---QQPPPQGGKQKPQGPPPQGGPR--PPPGKPQGPPPQG 108
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|
gi 291241653   929 GYQPT--PSPAYQQSPSPGGyQLTPSPGGYPMTPGAPSPG 966
Cdd:pfam15240  109 GNQQQgpPPPGKPQGPPPQG-GGPPPQGGNQQGPPPPPPG 147
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
493-537 1.22e-05

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 43.36  E-value: 1.22e-05
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....*....
gi 291241653  493 GDHVKVIGGRYEGDTGLIVRVED----NMVILFSDLTMHELKVLPRDLQ 537
Cdd:cd00380     1 GDVVRVLRGPYKGREGVVVDIDPrfgiVTVKGATGSKGAELKVRFDDVD 49
PHA03377 PHA03377
EBNA-3C; Provisional
757-976 1.35e-05

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 49.28  E-value: 1.35e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  757 VDKNRLNPVN--QPTSRGSTTSYR--HTPL----HPGQTPMYGSRTPMYGSQTPLHDGSRTPHY-----------GSQTP 817
Cdd:PHA03377  701 SEESHLSSMSptQPISHEEQPRYEdpDDPLdlslHPDQAPPPSHQAPYSGHEEPQAQQAPYPGYweprppqapylGYQEP 780
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  818 LHDG---SRTPGQTGAWdptnrntPARTDEFDYRfdeptpspaYGGTPNPATPGYSadtpPSNGPYTPATPGSSTMYSSS 894
Cdd:PHA03377  781 QAQGvqvSSYPGYAGPW-------GLRAQHPRYR---------HSWAYWSQYPGHG----HPQGPWAPRPPHLPPQWDGS 840
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  895 -----DHTYSPYQHSSTPSPGGYQGTPSPANYQPAPSPGGYQP---TPSPAYQQSPSPGGYQLTPSPGGYPMTPGAPSPG 966
Cdd:PHA03377  841 aghgqDQVSQFPHLQSETGPPRLQLSQVPQLPYSQTLVSSSAPswsSPQPRAPIRPIPTRFPPPPMPLQDSMAVGCDSSG 920
                         250
                  ....*....|
gi 291241653  967 GFNPLTPGAS 976
Cdd:PHA03377  921 TACPSMPFAS 930
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
767-985 1.49e-05

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 49.40  E-value: 1.49e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  767 QPTSRGSTTSYRHTPLHPGQTPMYGSRTPmygsqTPLHDGSRTPHYGSQTPLHDGSRTPGQTGAWDPTNRNTPARtdefD 846
Cdd:PHA03307  190 PAEPPPSTPPAAASPRPPRRSSPISASAS-----SPAPAPGRSAADDAGASSSDSSSSESSGCGWGPENECPLPR----P 260
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  847 YRFDEPTPS-PAYGGTPNPATPGYSADTPPSNGPYTPATPGSS-TMYSSSDHTYSPYQHSSTPSpggyqGTPSPANYQPA 924
Cdd:PHA03307  261 APITLPTRIwEASGWNGPSSRPGPASSSSSPRERSPSPSPSSPgSGPAPSSPRASSSSSSSRES-----SSSSTSSSSES 335
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 291241653  925 PSPGGYQPTPSPAYQQSPSPGGYQLTPSPggyPMTPGAPSPGGFNPLTPGASLDSGSSEWQ 985
Cdd:PHA03307  336 SRGAAVSPGPSPSRSPSPSRPPPPADPSS---PRKRPRPSRAPSSPAASAGRPTRRRARAA 393
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
821-988 1.50e-05

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 49.38  E-value: 1.50e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   821 GSRTPGQTGAWDPTNRNTPArtdefdyrfdePTPS-PAYGGTPNPATpgysadTPPSNGPYTPATPGSSTMYSSSDHtys 899
Cdd:pfam03154  179 GAASPPSPPPPGTTQAATAG-----------PTPSaPSVPPQGSPAT------SQPPNQTQSTAAPHTLIQQTPTLH--- 238
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   900 PYQHSSTPSPggYQGTPSPanyqPAPSPGGYQPTPSPAYQQSPSPGGYQLTPSPGGYPMtPGAPSPGGFNPLT------P 973
Cdd:pfam03154  239 PQRLPSPHPP--LQPMTQP----PPPSQVSPQPLPQPSLHGQMPPMPHSLQTGPSHMQH-PVPPQPFPLTPQSsqsqvpP 311
                          170
                   ....*....|....*
gi 291241653   974 GASLDSGSSEWQTIH 988
Cdd:pfam03154  312 GPSPAAPGQSQQRIH 326
KLF1_2_4_N cd21972
N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel ...
852-973 1.99e-05

N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide, R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF1, KLF2, KLF4, and similar proteins.


Pssm-ID: 409230 [Multi-domain]  Cd Length: 194  Bit Score: 46.52  E-value: 1.99e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  852 PTPSPAYG--GTPNPATPGYSADTPPSNGPYTPATPGSSTM------YSSSDHTYSPYQ-------HSSTPSPGGYqGTP 916
Cdd:cd21972    40 PPPDPAYPppESPESCSTVYDSDGCHPTPNAYCGPNGPGLPghfllaGNSPNLGPKIKTenqeqacMPVAGYSGHY-GPR 118
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|....*....
gi 291241653  917 SPANYQPAPSPGGYQPTPSP-AYQQSPSPGGYQLTPSPGGYPMTP-GAPSPGGFNPLTP 973
Cdd:cd21972   119 EPQRVPPAPPPPQYAGHFQYhGHFNMFSPPLRANHPGMSTVMLTPlSTPPLGFLSPEEA 177
PHA03247 PHA03247
large tegument protein UL36; Provisional
781-974 2.51e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.78  E-value: 2.51e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  781 PLHPGQTPMYGSRTPMYGSQTPLHDGSRTPHYGSQTPLHDGSRTPGQTGAWDPTNRNTPARTDEFDYRFDEPTPSPAYGG 860
Cdd:PHA03247 2772 PAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLP 2851
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  861 TPNPATPG--YSADTPPSNGPYTPATPGSSTMYSSSDHTYSPYQHSSTPSPGGYQGTPSPANYQPAPSPGGYQPTPSPAy 938
Cdd:PHA03247 2852 LGGSVAPGgdVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQ- 2930
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|
gi 291241653  939 QQSPSPGGYQLTPSPGGYPMTPGAPSPGGFNP----LTPG 974
Cdd:PHA03247 2931 PPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPwlgaLVPG 2970
PHA03247 PHA03247
large tegument protein UL36; Provisional
780-975 2.95e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.40  E-value: 2.95e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  780 TPLHPGQTPMYGSRTPMYGSQTPLHDGSRTPHYG-SQTPLHDGSRTPGQTGAwdPTNRNTPARTDEFDYRFDEPTPSPAY 858
Cdd:PHA03247 2741 PPAVPAGPATPGGPARPARPPTTAGPPAPAPPAApAAGPPRRLTRPAVASLS--ESRESLPSPWDPADPPAAVLAPAAAL 2818
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  859 GGTPNPATP------GYSADTPPSNGPYTPATPGSSTMYSSSDHTYSPYQHSSTPSPGGYQGTPSPANYQPAPS------ 926
Cdd:PHA03247 2819 PPAASPAGPlppptsAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSrstesf 2898
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|...
gi 291241653  927 ---PGGYQPTPSPAYQQSPSPGG-YQLTPSPGGYPMTPGAPSPggfnPLTPGA 975
Cdd:PHA03247 2899 alpPDQPERPPQPQAPPPPQPQPqPPPPPQPQPPPPPPPRPQP----PLAPTT 2947
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
825-970 3.40e-05

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 48.08  E-value: 3.40e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   825 PGQTGAWDPTNRNtPARTDEFDYRfdePTPSPAYGGTPNPATPGYSADTPPSNGPYTPATPGsstmysssdhtyspYQHS 904
Cdd:pfam09606  351 TWNPGNFGGLGAN-PMQRGQPGMM---SSPSPVPGQQVRQVTPNQFMRQSPQPSVPSPQGPG--------------SQPP 412
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 291241653   905 STPSPGGyqgTPSPAnYQPAPSPggyQPTPSPAYQQSPSpggyqlTPSPGGYPMTPG---APSPggFNP 970
Cdd:pfam09606  413 QSHPGGM---IPSPA-LIPSPSP---QMSQQPAQQRTIG------QDSPGGSLNTPGqsaVNSP--LNP 466
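Each hit in this listing carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". As a minimal sketch for readers who want to tabulate the hits, the Python below pulls the fields out of such a line. The names parse_pssm_line and PssmSummary are hypothetical, chosen for illustration only and not part of any NCBI tool, and the pattern assumes the field order shown on this page.

```python
import re
from typing import NamedTuple, Optional


class PssmSummary(NamedTuple):
    """Fields of one 'Pssm-ID: ...' summary line (hypothetical record type)."""
    pssm_id: int
    multi_domain: bool
    cd_length: int
    bit_score: float
    e_value: float


# Assumes the field order used on this page; the "[Multi-domain]" tag is optional.
_SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(\d+)\s*(\[Multi-domain\])?\s*"
    r"Cd Length:\s*(\d+)\s*"
    r"Bit Score:\s*([0-9.]+)\s*"
    r"E-value:\s*(\S+)"
)


def parse_pssm_line(line: str) -> Optional[PssmSummary]:
    """Return the parsed summary, or None if the line is not a summary line."""
    match = _SUMMARY_RE.search(line)
    if match is None:
        return None
    return PssmSummary(
        pssm_id=int(match.group(1)),
        multi_domain=match.group(2) is not None,
        cd_length=int(match.group(3)),
        bit_score=float(match.group(4)),
        e_value=float(match.group(5)),  # float() accepts forms like "3.40e-05"
    )


if __name__ == "__main__":
    line = "Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 48.08  E-value: 3.40e-05"
    print(parse_pssm_line(line))
```

Returning None for non-matching lines lets the same function be run over every line of the page, keeping only the summary rows.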
KOW_NusG cd06091
NusG contains an NGN domain at its N-terminus and KOW motif at its C-terminus; KOW_NusG motif ...
718-748 5.33e-05

NusG contains an NGN domain at its N-terminus and a KOW motif at its C-terminus; the KOW_NusG motif is one of the two domains of N-Utilization Substance G (NusG), a transcription elongation and Rho-termination factor in bacteria and archaea. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The eukaryotic ortholog of NusG is Spt5, which has multiple KOW motifs at its C-terminus.


Pssm-ID: 240515 [Multi-domain]  Cd Length: 56  Bit Score: 41.67  E-value: 5.33e-05
                          10        20        30
                  ....*....|....*....|....*....|...
gi 291241653  718 IGQTVRIREGPFKGYIGIVK--DATESTARVEL 748
Cdd:cd06091     6 VGDTVRIISGPFAGFEGKVEeiDEEKGKVKVLV 38
Chi1 COG3469
Chitinase [Carbohydrate transport and metabolism];
769-919 6.03e-05

Chitinase [Carbohydrate transport and metabolism];


Pssm-ID: 442692 [Multi-domain]  Cd Length: 534  Bit Score: 47.05  E-value: 6.03e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  769 TSRGSTTSYRHTPLHPGQTPMYGSRTPMYGSQTPLHDGSRTPHYGSQTPLHDGSRTPGQTGAWDPTNRNTPARTDEFDYR 848
Cdd:COG3469    67 SSTAATSSTTSTTATATAAAAAATSTSATLVATSTASGANTGTSTVTTTSTGAGSVTSTTSSTAGSTTTSGASATSSAGS 146
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 291241653  849 FDEPTPSPAYGGTPNPATPGYSADTPPSNGPYTPATPGSSTmysSSDHTYSPYQHSSTPSPGGYQGTPSPA 919
Cdd:COG3469   147 TTTTTTVSGTETATGGTTTTSTTTTTTSASTTPSATTTATA---TTASGATTPSATTTATTTGPPTPGLPK 214
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
615-652 6.04e-05

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 41.44  E-value: 6.04e-05
                          10        20        30
                  ....*....|....*....|....*....|....*...
gi 291241653  615 KDIVKVIDGPHSGRQGEVKHIYRSFAFLHSRLMTENGG 652
Cdd:cd00380     1 GDVVRVLRGPYKGREGVVVDIDPRFGIVTVKGATGSKG 38
PHA03247 PHA03247
large tegument protein UL36; Provisional
764-990 7.32e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 47.24  E-value: 7.32e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  764 PVNQPTSRGSTTSYRHTPLHPGQTPmygsrTPMYGSQTPLHDGSRTPHYGSQTPLHDGSR--TPGQTGAWDPTNRNTPAR 841
Cdd:PHA03247 2810 AVLAPAAALPPAASPAGPLPPPTSA-----QPTAPPPPPGPPPPSLPLGGSVAPGGDVRRrpPSRSPAAKPAAPARPPVR 2884
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  842 tdefdyRFDEPTPSPAYGGTPNPATPGYSADTPPSNGPYTPATPgsstmysssdhtySPYQHSSTPSPGGYQGTPSPANY 921
Cdd:PHA03247 2885 ------RLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQ-------------PPPPPQPQPPPPPPPRPQPPLAP 2945
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 291241653  922 QPAPSPGGyQPTPSPAYQQ--SPSPGGYQ----LTPSPGGYPMTPGAPSPggfnplTPGASLDSGSSEWQT---IHIE 990
Cdd:PHA03247 2946 TTDPAGAG-EPSGAVPQPWlgALVPGRVAvprfRVPQPAPSREAPASSTP------PLTGHSLSRVSSWASslaLHEE 3016
dnaA PRK14086
chromosomal replication initiator protein DnaA;
875-985 7.40e-05

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 46.74  E-value: 7.40e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  875 PSNGPYTPATPGSStMYSSSDhtyspyqhSSTPSPGGYQGTPSPA--NYQPAPSPGGYQPTPSPAY---QQSPSPGGYQL 949
Cdd:PRK14086   90 PSAGEPAPPPPHAR-RTSEPE--------LPRPGRRPYEGYGGPRadDRPPGLPRQDQLPTARPAYpayQQRPEPGAWPR 160
                          90       100       110
                  ....*....|....*....|....*....|....*.
gi 291241653  950 TPSPGGYPMTPGAPSPGgfNPLTPGASLDSGSSEWQ 985
Cdd:PRK14086  161 AADDYGWQQQRLGFPPR--APYASPASYAPEQERDR 194
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
718-749 8.19e-05

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 40.45  E-value: 8.19e-05
                           10        20        30
                   ....*....|....*....|....*....|..
gi 291241653   718 IGQTVRIREGPFKGYIGIVKDATESTARVELH 749
Cdd:pfam00467    1 KGDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE 32
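The paired rows in each alignment (the "gi 291241653" query row over the "Cdd:" row) can be compared column by column. The sketch below counts identical residues in such a pair. The function name count_identities is hypothetical, and the code assumes that "-" marks a gap and that lowercase letters mark unaligned residues that should be skipped.

```python
from typing import Tuple


def count_identities(query_row: str, subject_row: str) -> Tuple[int, int]:
    """Return (identical, aligned) column counts for two equal-length alignment rows.

    Assumptions (not taken from this page): '-' is a gap, and lowercase
    letters are unaligned residues; both kinds of column are ignored.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have the same length")

    identical = 0
    aligned = 0
    for q, s in zip(query_row, subject_row):
        if q == "-" or s == "-":
            continue  # gap column
        if q.islower() or s.islower():
            continue  # unaligned residue (assumed convention)
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


if __name__ == "__main__":
    # Rows copied from the pfam00467 hit at residues 718-749 above.
    query = "IGQTVRIREGPFKGYIGIVKDATESTARVELH"
    subject = "KGDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE"
    identical, aligned = count_identities(query, subject)
    print(f"{identical} identical of {aligned} aligned columns")
```

Only the residue strings are passed in; the row label and start/end coordinates on each line have to be stripped first.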
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
851-994 1.19e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 46.02  E-value: 1.19e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  851 EPTPSPAYGGTPNPATPGYSADTPPSNGPYTPATPGSSTMYSSSDHTYSPYQHSSTPSPggyqgTPSPANYQPAPSPGGY 930
Cdd:PRK12323  449 APAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEELPPEFA-----SPAPAQPDAAPAGWVA 523
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 291241653  931 QPTPSPAYQQsPSPGGYQLTPSPGGYPMTPGAPSPGGFNPLTPGASLDSGSS-----EWQTIHIEVKVK 994
Cdd:PRK12323  524 ESIPDPATAD-PDDAFETLAPAPAAAPAPRAAAATEPVVAPRPPRASASGLPdmfdgDWPALAARLPVR 591
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
489-516 1.25e-04

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 40.00  E-value: 1.25e-04
                            10        20
                    ....*....|....*....|....*...
gi 291241653    489 HFRMGDHVKVIGGRYEGDTGLIVRVEDN 516
Cdd:smart00739    1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28
DUF4813 pfam16072
Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. ...
919-1028 1.32e-04

Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 345 and 672 amino acids in length.


Pssm-ID: 435117 [Multi-domain]  Cd Length: 288  Bit Score: 45.13  E-value: 1.32e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   919 ANYQPAPS---PGGYQPTPSPAYQQSPSPGGYQLTPS---PGG---YPMTPGAPSPGGFNPLTPGASLDSGSSEWQtihi 989
Cdd:pfam16072    2 ATYHPAGAtyhPGGYAPAGATYHPAGQVPAGATYYPSggvPHGatyYPQAPVAAVPAGATYLPAGAAIPAGATYYP---- 77
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|.
gi 291241653   990 evkvKATHEDSALIYKIGVIRGISGG--MCSVFIPDEGRVV 1028
Cdd:pfam16072   78 ----QAPKSSSGLGLGTGLIAGALGGaiLGHALTPTQTRVV 114
PHA03247 PHA03247
large tegument protein UL36; Provisional
821-966 1.39e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 46.47  E-value: 1.39e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  821 GSRTPGQTGAWDPTNRNTPAR------TDE---------------------FDYRFDEPTPSPAyggTPNPATPGYSADT 873
Cdd:PHA03247 2494 AAPDPGGGGPPDPDAPPAPSRlapailPDEpvgepvhprmltwirgleelaSDDAGDPPPPLPP---AAPPAAPDRSVPP 2570
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  874 P-PSNGPYTPATPGSSTMYSSsdhtysPYQHSSTPSPGGYQGTPsPANYQPAPSPGGYQPTPSPAYQQSP---SPGGYQL 949
Cdd:PHA03247 2571 PrPAPRPSEPAVTSRARRPDA------PPQSARPRAPVDDRGDP-RGPAPPSPLPPDTHAPDPPPPSPSPaanEPDPHPP 2643
                         170
                  ....*....|....*..
gi 291241653  950 TPSPGGyPMTPGAPSPG 966
Cdd:PHA03247 2644 PTVPPP-ERPRDDPAPG 2659
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
852-970 1.75e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 45.91  E-value: 1.75e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   852 PTPSPAYGGTPNPATPGYSADTP-PSNGPYTPATPGSSTMYSSSDHTyspyQHSSTPSPGGY-------QGTPSPANYQP 923
Cdd:pfam03154  243 PSPHPPLQPMTQPPPPSQVSPQPlPQPSLHGQMPPMPHSLQTGPSHM----QHPVPPQPFPLtpqssqsQVPPGPSPAAP 318
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|.
gi 291241653   924 APSPGGYQPTPSPAYQQSPSPGGYQ-LTPSPGGYPMT---PGAPSPGGFNP 970
Cdd:pfam03154  319 GQSQQRIHTPPSQSQLQSQQPPREQpLPPAPLSMPHIkppPTTPIPQLPNP 369
dnaA PRK14086
chromosomal replication initiator protein DnaA;
802-978 1.77e-04

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 45.59  E-value: 1.77e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  802 PLHDGSRTPHYGSQTplhdgsrTPGQTGAWDPTNRNTPARTDEFdyrfdePTPSPAYGGTPNPATPGYSADTPPSNGP-- 879
Cdd:PRK14086  103 RRTSEPELPRPGRRP-------YEGYGGPRADDRPPGLPRQDQL------PTARPAYPAYQQRPEPGAWPRAADDYGWqq 169
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  880 ----YTPATP-GSSTMYS-SSDHTYSPY--QHSSTPSPGGYQGTPSPANYQ--------PAPSPG-GYQPTPSPAYQQSP 942
Cdd:PRK14086  170 qrlgFPPRAPyASPASYApEQERDREPYdaGRPEYDQRRRDYDHPRPDWDRprrdrtdrPEPPPGaGHVHRGGPGPPERD 249
                         170       180       190
                  ....*....|....*....|....*....|....*.
gi 291241653  943 SPGGYQLTPSPGGYPMTPGAPSPGgfnPLTPGASLD 978
Cdd:PRK14086  250 DAPVVPIRPSAPGPLAAQPAPAPG---PGEPTARLN 282
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
767-982 2.13e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 45.55  E-value: 2.13e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  767 QPTSRGSTTSYRHTPLHPGQTPMYGSRTPmyGSQTPLHDGSRTPHYGSQTPLHDGSRTPGQTGAWDPTNRNTPARTDEFD 846
Cdd:PHA03307   80 PANESRSTPTWSLSTLAPASPAREGSPTP--PGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGA 157
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  847 YRFDEPTPSPAYGGTPNPATPGYSADTPPSNGPYTPATPGSSTMYSSSDH----TYSPYQHSSTPSPGGYQGTPSPANYQ 922
Cdd:PHA03307  158 SPAAVASDAASSRQAALPLSSPEETARAPSSPPAEPPPSTPPAAASPRPPrrssPISASASSPAPAPGRSAADDAGASSS 237
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 291241653  923 PAPSPGGYQ---------PTPSPAYQQSPSPGGYQLTPSPGGYPMTPGAPSpGGFNPLTPGASLDSGSS 982
Cdd:PHA03307  238 DSSSSESSGcgwgpenecPLPRPAPITLPTRIWEASGWNGPSSRPGPASSS-SSPRERSPSPSPSSPGS 305
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
767-966 2.14e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 45.36  E-value: 2.14e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  767 QPTSRGSTTSYRH--TPLHPGQTPMYGSRTPMYGSQTPLHDGSRTPHYGSQTPLHDGSRTPGQTGAWDPTNRNTPARTde 844
Cdd:PRK07764  602 APASSGPPEEAARpaAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAAPA-- 679
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  845 fdyrfdEPTPSPAYGGTPNPATPGysadtPPSNGPYTPATPGSSTMYSSSDHTYSPYQHSSTPSPGGYQGTPSPA--NYQ 922
Cdd:PRK07764  680 ------APPPAPAPAAPAAPAGAA-----PAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPepDDP 748
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....
gi 291241653  923 PAPSPGGYQPTPSPAYQQSPSPGGYQLTPSPGGYPMTPGAPSPG 966
Cdd:PRK07764  749 PDPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSEEEEMAEDDAPS 792
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
493-523 2.14e-04

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 39.29  E-value: 2.14e-04
                           10        20        30
                   ....*....|....*....|....*....|.
gi 291241653   493 GDHVKVIGGRYEGDTGLIVRVEDNMVILFSD 523
Cdd:pfam00467    2 GDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE 32
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
611-637 3.12e-04

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 38.85  E-value: 3.12e-04
                            10        20
                    ....*....|....*....|....*..
gi 291241653    611 NIQVKDIVKVIDGPHSGRQGEVKHIYR 637
Cdd:smart00739    1 KFEVGDTVRVIAGPFKGKVGKVLEVDG 27
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
780-982 3.40e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 44.78  E-value: 3.40e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  780 TPLHPGQTPMYGSRTPMYGSQTPLHD--------GSRTPHYGSQTPLHDGSRTPGQTGAWDPTNRNTPARTDEFDYRFDE 851
Cdd:PHA03307   27 TPGDAADDLLSGSQGQLVSDSAELAAvtvvagaaACDRFEPPTGPPPGPGTEAPANESRSTPTWSLSTLAPASPAREGSP 106
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  852 PTPSPAYGGTPNPATPgySADTPPSNGPYTPATPGSSTMYSSSDHTYSPYQHSSTPSPGGYQGTPS----PANYQPAPSP 927
Cdd:PHA03307  107 TPPGPSSPDPPPPTPP--PASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAASSRqaalPLSSPEETAR 184
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|....*
gi 291241653  928 GGyqPTPSPAYQQSPSPGGYQLTPSPGGYPMTPGAPSPGGFNPLTPGASLDSGSS 982
Cdd:PHA03307  185 AP--SSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGRSAADDAGASSS 237
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
788-988 3.46e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 44.78  E-value: 3.46e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  788 PMYGSRTPMYGSQTPLHDGSRTPhygsQTPLHDGSRTPGQTGAWDPTNRNTPArtdefdyrfDEPTPSPA---------- 857
Cdd:PHA03307   75 PGTEAPANESRSTPTWSLSTLAP----ASPAREGSPTPPGPSSPDPPPPTPPP---------ASPPPSPApdlsemlrpv 141
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  858 YGGTPNPATPGYSADTPPSNGPYTPATPGS----STMYSSSDHTYSPYQHSSTPSPGGYQGTPSP-------ANYQPAPS 926
Cdd:PHA03307  142 GSPGPPPAASPPAAGASPAAVASDAASSRQaalpLSSPEETARAPSSPPAEPPPSTPPAAASPRPprrsspiSASASSPA 221
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 291241653  927 PGgyqPTPSPAYQQSPSPGGYQLTPSPGgypmtpGAPSPGGFNPLTPGASLDSGSSEWQTIH 988
Cdd:PHA03307  222 PA---PGRSAADDAGASSSDSSSSESSG------CGWGPENECPLPRPAPITLPTRIWEASG 274
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
614-643 3.68e-04

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 38.91  E-value: 3.68e-04
                           10        20        30
                   ....*....|....*....|....*....|
gi 291241653   614 VKDIVKVIDGPHSGRQGEVKHIYRSFAFLH 643
Cdd:pfam00467    1 KGDVVRVIAGPFKGKVGKVVEVDDKKNRVL 30
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
854-985 4.11e-04

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 42.08  E-value: 4.11e-04
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653    854 PSPAYGGTP------NPATPgYSADTPPSNG----PYTPATPGSSTMYSSSDHTYSPYQHSSTPSPGGYQGTPSPANYQP 923
Cdd:smart00818   22 PYPSYGYEPmggwlhHQIIP-VSQQHPPTHTlqphHHIPVLPAQQPVVPQQPLMPVPGQHSMTPTQHHQPNLPQPAQQPF 100
                            90       100       110       120       130       140
                    ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 291241653    924 APSPGGYQPTPSPAYQQSPSPGGYQLTPSPGGYPMTPGAPSPggfnPLTPGASLDSgsseWQ 985
Cdd:smart00818  101 QPQPLQPPQPQQPMQPQPPVHPIPPLPPQPPLPPMFPMQPLP----PLLPDLPLEA----WP 154
motB PRK12799
flagellar motor protein MotB; Reviewed
854-953 4.32e-04

flagellar motor protein MotB; Reviewed


Pssm-ID: 183756 [Multi-domain]  Cd Length: 421  Bit Score: 43.94  E-value: 4.32e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  854 PSPAYGGTPNPATPGYSADTPPSNGPYTPATPGSSTMYSSSDHTYSPYQHSSTPSPGGYQGTPSPANYQPAPSPGGYQPT 933
Cdd:PRK12799  316 ITPSSAAIPSPAVIPSSVTTQSATTTQASAVALSSAGVLPSDVTLPGTVALPAAEPVNMQPQPMSTTETQQSSTGNITST 395
                          90       100
                  ....*....|....*....|
gi 291241653  934 PSPAYQQSPSPGGYQLTPSP 953
Cdd:PRK12799  396 ANGPTTSLPAAPASNIPVSP 415
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
832-978 4.93e-04

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 43.90  E-value: 4.93e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  832 DPTNRNTPARTDEfdyrfdEPTPSPAYGGTPNPATPGysADTPPSNGPYTPATPGSSTMYSSSDH--TYSPYQHSSTPS- 908
Cdd:COG5180   239 TSEARSRPATVDA------QPEMRPPADAKERRRAAI--GDTPAAEPPGLPVLEAGSEPQSDAPEaeTARPIDVKGVASa 310
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  909 -PGGYQGTPSPANYQP-APSPG--GYQP--TPSPAYQQSPSPGGYQL-TPSPGGYPMTPGAPSPGGFN----PLTPGASL 977
Cdd:COG5180   311 pPATRPVRPPGGARDPgTPRPGqpTERPagVPEAASDAGQPPSAYPPaEEAVPGKPLEQGAPRPGSSGgdgaPFQPPNGA 390

                  .
gi 291241653  978 D 978
Cdd:COG5180   391 P 391
SP7_N cd22542
N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins ...
812-973 4.95e-04

N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP7, also called Osterix (Osx) in humans, is highly conserved among bone-forming vertebrates. It plays a major role, along with Runx2 and Dlx5, in driving the differentiation of mesenchymal precursor cells into osteoblasts and eventually osteocytes. SP7 also plays a regulatory role by inhibiting chondrocyte differentiation, maintaining the balance between differentiation of mesenchymal precursor cells into ossified bone or cartilage. Mutations of this gene have been associated with multiple dysfunctional bone phenotypes in vertebrates. SP7 is thought to play a role in diseases such as Osteogenesis imperfecta. SP7 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP7.


Pssm-ID: 411691 [Multi-domain]  Cd Length: 297  Bit Score: 43.35  E-value: 4.95e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  812 YGSQTPLHDgSRTPGQTGAWD-------------PTNRNTPARTDEFDYRFD------EPTPSP----AYGGTPNPatpg 868
Cdd:cd22542    26 FGGSSPIRD-SATPGKPGNNPgkkpyslgsdlssAKSRSSELMGDSYTATFSsgnglmSPSGSPqastTYGNDYNP---- 100
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  869 YSADTPPSNGPYTPATPGSST---------MYSSSD--HTYSPY-----QHSSTPSPGGYQGT----PSPANY-QPAPSP 927
Cdd:cd22542   101 FSHSFPTSSGSQDPSLLVSKGhpsadclpsVYTSLDmaHPYGSWyktgiHPGISSSSTNATASwwdmHSNTNWlSAQGQP 180
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*..
gi 291241653  928 GGYQPTPSPAYQQSP-SPggyQLTPSPGGYPMTPGAPSPGGFNPLTP 973
Cdd:cd22542   181 DGLQASLQPVPAQTPlNP---QLPSYTEFTTLNPAPYPAVGISSSSH 224
PHA03378 PHA03378
EBNA-3B; Provisional
853-970 5.44e-04

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 44.29  E-value: 5.44e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  853 TPSPAYGGTPNPATPGYSADTPPSNGPYTPAT--------PGSSTMYSSSDHTYSPYQHSSTPSP--------GGYQGT- 915
Cdd:PHA03378  587 SSAPSYAQTPWPVPHPSQTPEPPTTQSHIPETsaprqwpmPLRPIPMRPLRMQPITFNVLVFPTPhqppqveiTPYKPTw 666
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  916 --PSPANYQPAPS-----------PGGYQPTPSPAYQQSP--SPGGYQLTPSPGGYPMTPGAPSPGGFNP 970
Cdd:PHA03378  667 tqIGHIPYQPSPTgantmlpiqwaPGTMQPPPRAPTPMRPpaAPPGRAQRPAAATGRARPPAAAPGRARP 736
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
768-965 5.74e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 44.01  E-value: 5.74e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  768 PTSRGSTTSYRHTPLHPGQTPMYGSRTPMYGSqTPLHDGSRTPHYGSQTPLH--DGSRTPGQ--TGAWDPTNRNTPARTD 843
Cdd:PHA03307  242 SESSGCGWGPENECPLPRPAPITLPTRIWEAS-GWNGPSSRPGPASSSSSPRerSPSPSPSSpgSGPAPSSPRASSSSSS 320
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  844 EFDYRFDEPTPSPAYGGTPnPATPGYSADTPPSNGPYTPATPGSSTMYSSSDHTYSPY--QHSSTPSPGGYQGTPSPANY 921
Cdd:PHA03307  321 SRESSSSSTSSSSESSRGA-AVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSpaASAGRPTRRRARAAVAGRAR 399
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*....
gi 291241653  922 Q---PAPSPGGyQPTPSPAYQQSPSPGGYQLTP--SPGGYPMtPGAPSP 965
Cdd:PHA03307  400 RrdaTGRFPAG-RPRPSPLDAGAASGAFYARYPllTPSGEPW-PGSPPP 446
PTZ00395 PTZ00395
Sec24-related protein; Provisional
755-984 6.12e-04

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 44.30  E-value: 6.12e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  755 ISVDKNRLNPVNQPT-SRGSTTSYRhtplhpgqtpMYGSRTPmyGSQTPLHDGSRTPHYGSQTPL-HDGSRTPGQTGAW- 831
Cdd:PTZ00395  313 IQGDLVRGAPNDKNSfDRGNEKTYQ----------IYGGFHD--GSPNAASAGAPFNGLGNQADGgHINQVHPDARGAWa 380
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  832 -DPTNRNTPARTDEFDYRFDEPTPSPA-YGGTP--NP--ATPGYS----ADTPPSNGPYTpATPGSSTMYSSSDHTYSPY 901
Cdd:PTZ00395  381 gGPHSNASYNCAAYSNAAQSNAAQSNAgFSNAGysNPgnSNPGYNnapnSNTPYNNPPNS-NTPYSNPPNSNPPYSNLPY 459
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  902 QHSSTPSPGGYQGTPSPANYQPAPSPGGYQptpspaYQQSPSPGGYQLTPS-PGGYPMTPGAPSPGGfnplTPGASLDSG 980
Cdd:PTZ00395  460 SNTPYSNAPLSNAPPSSAKDHHSAYHAAYQ------HRAANQPAANLPTANqPAANNFHGAAGNSVG----NPFASRPFG 529

                  ....
gi 291241653  981 SSEW 984
Cdd:PTZ00395  530 SAPY 533
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
852-960 6.21e-04

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 41.95  E-value: 6.21e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   852 PTPSPAYGGTPNPATPGYSADTPPSNG---PYTPATPGSSTMYSSSDHTYSPY---QHSSTPSPGGYQGTPSPANYQPaP 925
Cdd:pfam15240   55 PPQPPASDDPPGPPPPGGPQQPPPQGGkqkPQGPPPQGGPRPPPGKPQGPPPQggnQQQGPPPPGKPQGPPPQGGGPP-P 133
                           90       100       110
                   ....*....|....*....|....*....|....*..
gi 291241653   926 SPGGYQ--PTPSPAYQQSPSPggyqlTPSPGGYPMTP 960
Cdd:pfam15240  134 QGGNQQgpPPPPPGNPQGPPQ-----RPPQPGNPQGP 165
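Each hit on this page carries a statistics line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The following is a minimal sketch of how such a line could be read into a record. The helper name `parse_stats` and the dictionary keys are hypothetical, and the pattern assumes only the layout visible on this page, including the optional "[Multi-domain]" flag.

```python
import re

# Assumed layout, taken from the statistics lines shown on this page.
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"          # optional flag such as [Multi-domain]
    r"\s*Cd Length:\s*(?P<cd_length>\d+)"
    r"\s*Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s*E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_stats(line):
    """Parse one statistics line into a dict (hypothetical helper)."""
    m = STATS_RE.search(line)
    if m is None:
        raise ValueError(f"not a statistics line: {line!r}")
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


stats = parse_stats(
    "Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 41.95  E-value: 6.21e-04"
)
print(stats)
```

Lines without the bracketed flag, such as the one for Pssm-ID 223044, parse the same way with `multi_domain` set to False.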
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
804-976 6.91e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 43.82  E-value: 6.91e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  804 HDGSRTPHYGSQTPLHDGSR--TPGQTGAWDPTNRNTPARTDEFDYRFDEPTPSPAYGGTPNPATPGYSADTPPSNGPYT 881
Cdd:PRK07764  595 AGGEGPPAPASSGPPEEAARpaAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAG 674
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  882 PATPGSSTMYSSSDHtyspyQHSSTPSPGGYQGTPSPANYQPAPSPGGYQPTPSPAYQQS----------PSPGGYQLTP 951
Cdd:PRK07764  675 GAAPAAPPPAPAPAA-----PAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASapspaaddpvPLPPEPDDPP 749
                         170       180
                  ....*....|....*....|....*
gi 291241653  952 SPGGYPMTPGAPsPGGFNPLTPGAS 976
Cdd:PRK07764  750 DPAGAPAQPPPP-PAPAPAAAPAAA 773
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
901-970 7.86e-04

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 41.56  E-value: 7.86e-04
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 291241653   901 YQHSSTPSPGGYQGTPSPANYQPAPSPGGYQPTPSP--AYQQSPSPGGYQLTPSPGGYPMTPGAPSPGGFNP 970
Cdd:pfam15240   25 QEDSPSLISEEEGQSQQGGQGPQGPPPGGFPPQPPAsdDPPGPPPPGGPQQPPPQGGKQKPQGPPPQGGPRP 96
PHA03325 PHA03325
nuclear-egress-membrane-like protein; Provisional
814-960 8.40e-04

nuclear-egress-membrane-like protein; Provisional


Pssm-ID: 223044  Cd Length: 418  Bit Score: 42.95  E-value: 8.40e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  814 SQTPLHDGSRTPGQTGAWDPTNRNTPARTDEFDYRFDEPTPSPAYGGTPNPATP---GYSADTPPSNGPYT-----PATP 885
Cdd:PHA03325  266 SSLPTSAPKRRSRRAGAMRAAAGETADLADDDGSEHSDPEPLPASLPPPPVRRPrvkHPEAGKEEPDGARNaeakePAQP 345
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 291241653  886 GSSTmySSSDHTYSPYQHSSTPSPGGYQGTPSPANYQPAPSPGGYQPTPSPAYQQSPSPGGYQLTPSPggyPMTP 960
Cdd:PHA03325  346 ATST--SSKGSSSAQNKDSGSTGPGSSLAAASSFLEDDDFGSPPLDLTTSLRHMPSPSVTSAPEPPSI---PLTY 415
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
801-981 9.45e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.62  E-value: 9.45e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  801 TPLHDGSRTPHYGSQ----TPLHDGSRTPGQTGAwDPTNRNTPArtdefdyrfDEPTPSPaygGTPNPAtPGYSADTPPS 876
Cdd:PHA03307   26 ATPGDAADDLLSGSQgqlvSDSAELAAVTVVAGA-AACDRFEPP---------TGPPPGP---GTEAPA-NESRSTPTWS 91
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  877 NGPYTPATPgsstmysssdhtyspyQHSSTPSPGGYQGTPSPAnyqPAPSPGGYQPTPSPAYQQSPSPGGyqlTPSPGGY 956
Cdd:PHA03307   92 LSTLAPASP----------------AREGSPTPPGPSSPDPPP---PTPPPASPPPSPAPDLSEMLRPVG---SPGPPPA 149
                         170       180
                  ....*....|....*....|....*
gi 291241653  957 PMTPGAPSPGGFNPLTPGASLDSGS 981
Cdd:PHA03307  150 ASPPAAGASPAAVASDAASSRQAAL 174
CTD smart01104
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
866-982 1.02e-03

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 215026 [Multi-domain]  Cd Length: 121  Bit Score: 40.20  E-value: 1.02e-03
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653    866 TPGYSADtppsnGPYTPATPgsstmysssdhtyspyqhSSTPSPGGYQGTPSPANYQPA-----------PSPGGYQPTP 934
Cdd:smart01104    4 TPAWGAS-----GSKTPAWG------------------SRTPGTAAGGAPTARGGSGSRtpawggagsrtPAWGGAGPTG 60
                            90       100       110       120       130       140
                    ....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653    935 S--PAYQQSPSPGGYQLTPSP----GGYPMTPGAPSPGG------FNPLTPGASLDSGSS 982
Cdd:smart01104   61 SrtPAWGGASAWGNKSSEGSAsswaAGPGGAYGAPTPGYggtpsaYGPATPGGGAMAGSA 120
PHA03378 PHA03378
EBNA-3B; Provisional
799-975 1.13e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 43.13  E-value: 1.13e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  799 SQTPLHDGSRTPHY-GSQTPLHDGSRTPGqtgawDPTNRNTPARTDEFDYRFDEPTPSPAYGGTPNPATPGYSADTPPSN 877
Cdd:PHA03378  579 SPTTSQLASSAPSYaQTPWPVPHPSQTPE-----PPTTQSHIPETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQ 653
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  878 GPYTPATPGSSTmYSSSDHTysPYQHSSTPSPGGYQGTPSPANYQPAP-SPGGYQPTPSPAYQQSPSPGGYQLTPSPGGY 956
Cdd:PHA03378  654 PPQVEITPYKPT-WTQIGHI--PYQPSPTGANTMLPIQWAPGTMQPPPrAPTPMRPPAAPPGRAQRPAAATGRARPPAAA 730
                         170       180
                  ....*....|....*....|...
gi 291241653  957 P--MTPGAPSPGGFNP--LTPGA 975
Cdd:PHA03378  731 PgrARPPAAAPGRARPpaAAPGR 753
motB PRK12799
flagellar motor protein MotB; Reviewed
856-964 1.34e-03

flagellar motor protein MotB; Reviewed


Pssm-ID: 183756 [Multi-domain]  Cd Length: 421  Bit Score: 42.40  E-value: 1.34e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  856 PAYGGTPNPATPGYSADTPPSNGPYTPATPGSSTMYSSSDHTYSPYQHSSTPSPGGYQGTPSPANYQPAPSPGGYQPTPS 935
Cdd:PRK12799  300 PVAAVTPSSAVTQSSAITPSSAAIPSPAVIPSSVTTQSATTTQASAVALSSAGVLPSDVTLPGTVALPAAEPVNMQPQPM 379
                          90       100
                  ....*....|....*....|....*....
gi 291241653  936 PAYQQSPSPGGYQLTPSPGGYPMTPGAPS 964
Cdd:PRK12799  380 STTETQQSSTGNITSTANGPTTSLPAAPA 408
GATA-N pfam05349
GATA-type transcription activator, N-terminal; GATA transcription factors mediate cell ...
861-976 1.34e-03

GATA-type transcription activator, N-terminal; GATA transcription factors mediate cell differentiation in a diverse range of tissues. Mutations are often associated with certain congenital human disorders. The six classical vertebrate GATA proteins, GATA-1 to GATA-6, are highly homologous and have two tandem zinc fingers. The classical GATA transcription factors function as transcription activators. In lower metazoans GATA proteins carry a single canonical zinc finger. This family represents the N-terminal domain of the family of GATA transcription activators.


Pssm-ID: 461628 [Multi-domain]  Cd Length: 174  Bit Score: 40.88  E-value: 1.34e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   861 TPNPATPGYSADTppsnGPYTPATPGS-----STMYSSSDHTYsPYQHSSTPSPGGYQGTPSPANYQPAPSPGGY---QP 932
Cdd:pfam05349    8 AANHGQAAYDHDS----GGFLHSAASSpvyvpTTRVPSMLPTL-PYLQGCGSSQQSHPVSSHSGWAQAGAESSSYnpgSP 82
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*
gi 291241653   933 TPSPAYQQSPSPggyqltpspggyPMTPGAPSPGGF-NPLTPGAS 976
Cdd:pfam05349   83 HPSPRFSYSHSP------------PGSNGTSRDAAYqSPLLISAG 115
KLF1_N cd21581
N-terminal domain of Kruppel-like Factor 1; Kruppel-like Factor 1 (KLF1, also known as ...
847-967 1.43e-03

N-terminal domain of Kruppel-like Factor 1; Kruppel-like Factor 1 (KLF1, also known as Krueppel-like factor 1 or Erythroid Kruppel-like Factor/EKLF) was the first Kruppel-like factor discovered. It was found to be vitally important for embryonic erythropoiesis in promoting the switch from fetal hemoglobin (Hemoglobin F) to adult hemoglobin (Hemoglobin A) gene expression by binding to highly conserved CACCC domains. EKLF ablation in mouse embryos produces a lethal anemic phenotype, causing death by embryonic day 14, and natural mutations lead to beta+ thalassemia in humans. However, expression of embryonic hemoglobin and fetal hemoglobin genes is normal in EKLF-deficient mice, suggesting other factors may be involved. KLF1 functions as a transcriptional activator. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF1, which is related to the N-terminal domains of KLF2 and KLF4.


Pssm-ID: 409227 [Multi-domain]  Cd Length: 278  Bit Score: 41.95  E-value: 1.43e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  847 YRFDEPTPSPAYGGTPNPATPGYSADTPPSNGPYTPATPGSStmYSSSDHtYSPYQHSS--TPSPGGYQGTPSPANYQPA 924
Cdd:cd21581   138 PHHGYPDAFVGPALFPAPANVDQFGFPQGGSVDRRGNLSKSG--SWDFGS-YYPQQHPSvvAFPDSRFGPLSGPQALTPD 214
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*..
gi 291241653  925 PSPGGY----QPTPSPAYQQSPSPGGYQLTPSPGGYPMTPGAPsPGG 967
Cdd:cd21581   215 PQHYGYfqlfRHNAALFPDYAHSPGPGHLPLGQQPLLPDPPLP-PGG 260
DUF4045 pfam13254
Domain of unknown function (DUF4045); This presumed domain is functionally uncharacterized. ...
849-976 1.48e-03

Domain of unknown function (DUF4045); This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is typically between 384 and 430 amino acids in length.


Pssm-ID: 433066 [Multi-domain]  Cd Length: 415  Bit Score: 42.46  E-value: 1.48e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   849 FDEPTP-----SPAYGG-TPNPATPGYSADTPPSN-GPYTPATPgsstmySSSDHTYSPYQHSSTPSPGGYQGTPSPANY 921
Cdd:pfam13254  200 FKEVTPvglmrSPAPGGhSKSPSVSGISADSSPTKeEPSEEADT------LSTDKEQSPAPTSASEPPPKTKELPKDSEE 273
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 291241653   922 QPAPSPGGYQPTPSPAYQQSPSPGGYQLTPSPggyPMTPGAPSPGGFNPLTPGAS 976
Cdd:pfam13254  274 PAAPSKSAEASTEKKEPDTESSPETSSEKSAP---SLLSPVSKASIDKPLSSPDR 325
PRK10263 PRK10263
DNA translocase FtsK; Provisional
853-965 1.55e-03

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 42.76  E-value: 1.55e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  853 TPSPAYGGTPN--PATPGYSADTPPSNGPYT-PATPGSSTMYSSSDHTYSPYQHSSTPSPGGYQGTPSPANYQPAPSPGG 929
Cdd:PRK10263  368 TGEPVIAPAPEgyPQQSQYAQPAVQYNEPLQqPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAW 447
                          90       100       110
                  ....*....|....*....|....*....|....*.
gi 291241653  930 YQPTPSPAYQQSPSPGGYQLTPSPGGYPMTPGAPSP 965
Cdd:PRK10263  448 QAEEQQSTFAPQSTYQTEQTYQQPAAQEPLYQQPQP 483
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
825-970 1.58e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 42.56  E-value: 1.58e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  825 PGQTGAwdptnRNTPARTDEFDYRFDEPTPSPAYGGTPNPATPGYSADTPPSNGPYTPATPGSSTMYSSSDHTYSPYQHS 904
Cdd:PRK12323  365 PGQSGG-----GAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQA 439
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 291241653  905 STPSPGGYQGtPSPAnyqPAPSPGgyqPTPSPAYQQSPSPGGYQLTPSPGGYPMTPGAPSPGGFNP 970
Cdd:PRK12323  440 SARGPGGAPA-PAPA---PAAAPA---AAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPP 498
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
852-979 1.60e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 42.67  E-value: 1.60e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  852 PTPSPAYGGTPNPATPGYSADTP-PSNGPYTPATPGSSTMYSSSDHTYSPyqhSSTPSPGGYQGTPsPANYQPAPSPGGy 930
Cdd:PRK07764  386 GVAGGAGAPAAAAPSAAAAAPAAaPAPAAAAPAAAAAPAPAAAPQPAPAP---APAPAPPSPAGNA-PAGGAPSPPPAA- 460
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*....
gi 291241653  931 QPTPSPAYQQSPSPggyQLTPSPGGYPMTPGAPSPGGFNPLTPGASLDS 979
Cdd:PRK07764  461 APSAQPAPAPAAAP---EPTAAPAPAPPAAPAPAAAPAAPAAPAAPAGA 506
SP7_N cd22542
N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins ...
812-900 2.05e-03

N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP7, also called Osterix (Osx) in humans, is highly conserved among bone-forming vertebrates. It plays a major role, along with Runx2 and Dlx5, in driving the differentiation of mesenchymal precursor cells into osteoblasts and eventually osteocytes. SP7 also plays a regulatory role by inhibiting chondrocyte differentiation, maintaining the balance between differentiation of mesenchymal precursor cells into ossified bone or cartilage. Mutations of this gene have been associated with multiple dysfunctional bone phenotypes in vertebrates. SP7 is thought to play a role in diseases such as Osteogenesis imperfecta. SP7 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP7.


Pssm-ID: 411691 [Multi-domain]  Cd Length: 297  Bit Score: 41.42  E-value: 2.05e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  812 YGS--QTPLHDGSRT--PGQTGAWDPTNRNT-----PARTDEFDYRFdEPTPSPAyggTPNPATPGYSADTPPSNGPYTP 882
Cdd:cd22542   141 YGSwyKTGIHPGISSssTNATASWWDMHSNTnwlsaQGQPDGLQASL-QPVPAQT---PLNPQLPSYTEFTTLNPAPYPA 216
                          90
                  ....*....|....*....
gi 291241653  883 ATPGSST-MYSSSDHTYSP 900
Cdd:cd22542   217 VGISSSShLLPSSQHMLSQ 235
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
852-969 2.10e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 42.28  E-value: 2.10e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  852 PTPSPAYGG----TPNPATPGYSADTPPSNGPYTPATPGSSTMYSSSDHTYSPYQHSSTPSPGGYQGTPSPANYQPAPSP 927
Cdd:PRK07764  396 AAAPSAAAAapaaAPAPAAAAPAAAAAPAPAAAPQPAPAPAPAPAPPSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAAPE 475
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|..
gi 291241653  928 GGYQPTPSPAYQQSPSPggyqlTPSPggyPMTPGAPSPGGFN 969
Cdd:PRK07764  476 PTAAPAPAPPAAPAPAA-----APAA---PAAPAAPAGADDA 509
KLF4_N cd21582
N-terminal domain of Kruppel-like factor 4; Kruppel-like factor 4 (KLF4; also known as ...
859-960 2.96e-03

N-terminal domain of Kruppel-like factor 4; Kruppel-like factor 4 (KLF4; also known as Krueppel-like factor 4 or gut-enriched Kruppel-like factor/GKLF) is a protein that, in humans, is encoded by the KLF4 gene. Evidence also suggests that KLF4 is a tumor suppressor in certain cancers, including colorectal cancer, gastric cancer, esophageal squamous cell carcinoma, intestinal cancer, prostate cancer, bladder cancer and lung cancer. It may act as a tumor promoter where increased KLF4 expression has been reported, such as in oral squamous cell carcinoma and in primary breast ductal carcinoma. KLF4 is one of four key factors that are essential for inducing pluripotent stem cells. KLF4 is highly expressed in non-dividing cells and its overexpression induces cell cycle arrest. KLF proteins KLF1, KLF2, KLF4, KLF5, KLF6, and KLF7 are transcriptional activators. KLF4 belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF4, which is related to the N-terminal domains of KLF1 and KLF2.


Pssm-ID: 409228 [Multi-domain]  Cd Length: 335  Bit Score: 41.22  E-value: 2.96e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653  859 GGTPNPATPGYSADTPPSNGPYTPATPGSSTMYS-SSDHTYSPYQHSSTPSPGGYQGTPSPANYQPapspggYQPTPSPA 937
Cdd:cd21582   228 GGNSQHGFSQRAPLPSRTTPSGGPGGGNSSTAESlMSRDHHPSSQVLSHPPLPLPQGYHPSPGYPP------FPPPPSQP 301
                          90       100
                  ....*....|....*....|...
gi 291241653  938 YQQspspggYQLTPSPGGYPMTP 960
Cdd:cd21582   302 QQY------QELMSPGSCLPEEP 318
DUF3729 pfam12526
Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins ...
862-965 2.99e-03

Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins in this family are typically between 145 and 1707 amino acids in length. The family is found in association with pfam01443, pfam01661, pfam05417, pfam01660, pfam00978. There is a single completely conserved residue L that may be functionally important.


Pssm-ID: 372164 [Multi-domain]  Cd Length: 115  Bit Score: 38.52  E-value: 2.99e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   862 PNPATPGYSADTPPSNGPYTPAtpgsstmysssdhtyspyqhSSTPSPGGYQGTPSPANYQPAPSPGGYQPTPSPAYQQS 941
Cdd:pfam12526   29 FSPPESAHPDPPPPVGDPRPPV--------------------VDTPPPVSAVWVLPPPSEPAAPEPDLVPPVTGPAGPPS 88
                           90       100
                   ....*....|....*....|....
gi 291241653   942 PspggyqLTPSPGGYPMTPGAPSP 965
Cdd:pfam12526   89 P------LAPPAPAQKPPLPPPRP 106
NusG COG0250
Transcription termination/antitermination protein NusG [Transcription];
718-748 3.39e-03

Transcription termination/antitermination protein NusG [Transcription];


Pssm-ID: 440020 [Multi-domain]  Cd Length: 171  Bit Score: 39.42  E-value: 3.39e-03
                          10        20        30
                  ....*....|....*....|....*....|.
gi 291241653  718 IGQTVRIREGPFKGYIGIVKDATESTARVEL 748
Cdd:COG0250   120 VGDRVRITDGPFAGFEGTVEEVDPEKGRVKV 150
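The two alignment rows above pair query residues 718-748 with positions 120-150 of the COG0250 model. A rough sketch of how the identity of such a pairwise block could be counted follows. The function name is hypothetical; the sketch skips any column containing a gap character and compares residues case-insensitively, which is a simplification, since lowercase letters in these alignments mark residues that are not aligned to the model.

```python
def alignment_identity(query, subject):
    """Count identical residues over the gap-free columns of two aligned rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    aligned = 0
    matches = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            continue  # column with a gap in either row
        aligned += 1
        if q.upper() == s.upper():
            matches += 1
    return matches, aligned


# Rows copied from the NusG (COG0250) alignment on this page.
matches, aligned = alignment_identity(
    "IGQTVRIREGPFKGYIGIVKDATESTARVEL",
    "VGDRVRITDGPFAGFEGTVEEVDPEKGRVKV",
)
print(f"{matches}/{aligned} identical ({matches / aligned:.1%})")
```

Identity alone does not determine the bit score or E-value reported for a hit; it is shown here only as a quick way to compare alignment rows.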
Drf_FH1 pfam06346
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs) ...
852-974 4.30e-03

Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues.


Pssm-ID: 461881 [Multi-domain]  Cd Length: 157  Bit Score: 39.08  E-value: 4.30e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   852 PTPSPAYGGTPnPATPGYSADTPPS-----NGPYTPATPGSSTMYSSSDHTYSPYQHSSTPSPGGYQGTPSP------AN 920
Cdd:pfam06346    3 PPPLPGDSSTI-PLPPGACIPTPPPlpgggGPPPPPPLPGSAAIPPPPPLPGGTSIPPPPPLPGAASIPPPPplpgstGI 81
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 291241653   921 YQPAPSPGGYQPTPSPAyqqsPSPGGYQLTPSP----GGYPMTPGAPSPG--GFNPLTPG 974
Cdd:pfam06346   82 PPPPPLPGGAGIPPPPP----PLPGGAGVPPPPpplpGGPGIPPPPPFPGgpGIPPPPPG 137
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
719-758 4.60e-03

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 36.04  E-value: 4.60e-03
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....
gi 291241653  719 GQTVRIREGPFKGYIGIVKDATEST--ARVELHT--NCKTISVD 758
Cdd:cd00380     1 GDVVRVLRGPYKGREGVVVDIDPRFgiVTVKGATgsKGAELKVR 44
YppG pfam14179
YppG-like protein; The YppG-like protein family includes the B. subtilis YppG protein, which ...
901-953 5.60e-03

YppG-like protein; The YppG-like protein family includes the B. subtilis YppG protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 115 and 181 amino acids in length. There are two completely conserved residues (F and G) that may be functionally important.


Pssm-ID: 372950 [Multi-domain]  Cd Length: 101  Bit Score: 37.48  E-value: 5.60e-03
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 291241653   901 YQHSSTPSPGGYQG---TPSPANYQPAPSPGGYQPTPSPAYQQSPSPGGYQLTPSP 953
Cdd:pfam14179    1 YQHNSQPYPYFSQQvyqQPVQPQYPPFAPQQYMPQPPMPYMNPYPKQQPQQQQPSQ 56
KOW_NusG cd06091
NusG contains an NGN domain at its N-terminus and KOW motif at its C-terminus; KOW_NusG motif ...
611-638 6.50e-03

NusG contains an NGN domain at its N-terminus and KOW motif at its C-terminus; KOW_NusG motif is one of the two domains of N-Utilization Substance G (NusG), a transcription elongation and Rho-termination factor in bacteria and archaea. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The eukaryotic ortholog of NusG is Spt5 with multiple KOW motifs at its C-terminus.


Pssm-ID: 240515 [Multi-domain]  Cd Length: 56  Bit Score: 35.90  E-value: 6.50e-03
                          10        20
                  ....*....|....*....|....*...
gi 291241653  611 NIQVKDIVKVIDGPHSGRQGEVKHIYRS 638
Cdd:cd06091     3 DFEVGDTVRIISGPFAGFEGKVEEIDEE 30
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
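The E-value threshold of 0.01 listed above is why every hit on this page has an E-value below that cutoff. As a small sketch of the same kind of filtering applied after the fact, the records below copy the name, accession, interval and E-value of a few hits from the list. The `Hit` type and `filter_hits` function are hypothetical names introduced for the example, and the strict "less than" comparison is an assumption.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    name: str
    accession: str
    start: int
    end: int
    e_value: float


# A few hits copied from the list of domain hits on this page.
HITS = [
    Hit("Pro-rich", "pfam15240", 852, 960, 6.21e-04),
    Hit("CTD", "smart01104", 866, 982, 1.02e-03),
    Hit("NusG", "COG0250", 718, 748, 3.39e-03),
    Hit("KOW", "cd00380", 719, 758, 4.60e-03),
    Hit("KOW_NusG", "cd06091", 611, 638, 6.50e-03),
]


def filter_hits(hits: List[Hit], threshold: float = 0.01) -> List[Hit]:
    """Keep hits whose E-value is below the threshold, best (smallest) first."""
    kept = [h for h in hits if h.e_value < threshold]
    return sorted(kept, key=lambda h: h.e_value)


for hit in filter_hits(HITS, threshold=4e-03):
    print(f"{hit.name:10s} {hit.accession:12s} {hit.start}-{hit.end}  {hit.e_value:.2e}")
```

Tightening the threshold in this way drops the short KOW matches first, since they carry the largest E-values among the hits shown.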
