Conserved domains on  [gi|1731019548|gb|TYK07353|]

pol protein [Cucumis melo var. makuwa]

List of domain hits

Name Accession Description Interval E-value
Glycosyltransferase_GTB-type super family cl10013
glycosyltransferase family 1 and related proteins with GTB topology; Glycosyltransferases ...
1-487 0e+00

glycosyltransferase family 1 and related proteins with GTB topology; Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility.


The actual alignment was detected with superfamily member PLN02534:

Pssm-ID: 471961 [Multi-domain]  Cd Length: 491  Bit Score: 723.19  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548    1 MASTLSNQLelqpHFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPYNATRLESFFARAKQSSLSISLLEIPFPCLQ 80
Cdd:PLN02534     1 KAVSKAKQL----HFVLIPLMAQGHMIPMIDMARLLAERGVIVSLVTTPQNASRFAKTIDRARESGLPIRLVQIPFPCKE 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   81 VGLPLGCENLDTLPSRSLLRNFYKALSLLQQPLEQFLSRHHLNPTCIISDKYLYWTAQTAHKFKCPRVVFHGTGCFSLLS 160
Cdd:PLN02534    77 VGLPIGCENLDTLPSRDLLRKFYDAVDKLQQPLERFLEQAKPPPSCIISDKCLSWTSKTAQRFNIPRIVFHGMCCFSLLS 156
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  161 SHNLQLYSPHTSIDSNSQPFLVPGLPHKIEITKSQLPGSLIKSPDFDDFRDKITKAEQEAYGVVVNSFSELENGYYQNYE 240
Cdd:PLN02534   157 SHNIRLHNAHLSVSSDSEPFVVPGMPQSIEITRAQLPGAFVSLPDLDDVRNKMREAESTAFGVVVNSFNELEHGCAEAYE 236
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  241 RAISKKLWCIGPVSLCNENSIEKYNRGNKASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTRPFI 320
Cdd:PLN02534   237 KAIKKKVWCVGPVSLCNKRNLDKFERGNKASIDETQCLEWLDSMKPRSVIYACLGSLCRLVPSQLIELGLGLEASKKPFI 316
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  321 WVIKNrDENCSELEKWLSEEEFERKTKGRGLIIRGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQF 400
Cdd:PLN02534   317 WVIKT-GEKHSELEEWLVKENFEERIKGRGLLIKGWAPQVLILSHPAIGGFLTHCGWNSTIEGICSGVPMITWPLFAEQF 395
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  401 LNEKLVVEILKIGVRVGVEGAVRWGEEERVGVMAKKEEIEKAIEMVMDG-GEEGEERRRRVGDLSKMAPKAMENGGSSYV 479
Cdd:PLN02534   396 LNEKLIVEVLRIGVRVGVEVPVRWGDEERVGVLVKKDEVEKAVKTLMDDgGEEGERRRRRAQELGVMARKAMELGGSSHI 475

                   ....*...
gi 1731019548  480 NLSLFIED 487
Cdd:PLN02534   476 NLSILIQD 483
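Each hit in this report carries a one-line summary of the form `Pssm-ID: … Cd Length: … Bit Score: … E-value: …`, optionally tagged `[Multi-domain]`. A minimal sketch of how such a line could be parsed is given below. The regular expression and the function name `parse_pssm_line` are illustrative assumptions based only on the line layout seen in this report, not on any official NCBI parser.

```python
import re

# Assumed layout, inferred from the summary lines in this report:
# "Pssm-ID: <int> [Multi-domain]  Cd Length: <int>  Bit Score: <float>  E-value: <float>"
PSSM_LINE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?P<multi>\s*\[Multi-domain\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_pssm_line(line):
    """Return the fields of a hit summary line as a dict, or None if no match."""
    m = PSSM_LINE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("multi") is not None,
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


# Summary line of the PLN02534 alignment above.
record = parse_pssm_line(
    "Pssm-ID: 471961 [Multi-domain]  Cd Length: 491  Bit Score: 723.19  E-value: 0e+00"
)
print(record)
```

The reported `0e+00` parses to a float of 0.0; it most likely reflects an E-value too small for the display format rather than an exact zero.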
RT_LTR cd01647
RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long ...
1416-1592 2.07e-94

RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long terminal repeats (LTRs) in their DNA copies but not in their RNA template. RT catalyzes DNA replication from an RNA template, and is responsible for the replication of retroelements. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs are present in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and Caulimoviruses.



Pssm-ID: 238825  Cd Length: 177  Bit Score: 302.98  E-value: 2.07e-94
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1416 GFIRPSVSPWGAPVLFVKKKDGSMRLCIDYRELNKVTVKNRYPLPRIDDLFDQLQGATVFSKIDLRSGYHQLRIRDGDIP 1495
Cdd:cd01647      1 GIIEPSSSPYASPVVVVKKKDGKLRLCVDYRKLNKVTIKDRYPLPTIDELLEELAGAKVFSKLDLRSGYHQIPLAEESRP 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1496 KTAFRSRYGHYEFVVMSFGLTNAPAVFMDLMNRVFKEFLDSFVIVFIDDILIYSKTEAEHEEHLHQVLETLRANKLYAKF 1575
Cdd:cd01647     81 KTAFRTPFGLYEYTRMPFGLKNAPATFQRLMNKILGDLLGDFVEVYLDDILVYSKTEEEHLEHLREVLERLREAGLKLNP 160
                          170
                   ....*....|....*..
gi 1731019548 1576 SKCEFWLRKVTFLGHVV 1592
Cdd:cd01647    161 EKCEFGVPEVEFLGHIV 177
RNase_HI_RT_Ty3 cd09274
Ty3/Gypsy family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ...
1686-1801 1.30e-57

Ty3/Gypsy family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and in non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Based on phylogenetic patterns, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The Ty3/Gypsy family is widely distributed among the genomes of plants, fungi and animals. RNase H has been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.



Pssm-ID: 260006 [Multi-domain]  Cd Length: 121  Bit Score: 195.02  E-value: 1.30e-57
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1686 VIYSDASKKGLGCVLMQ-----QGKVVAYASRQLKIHEQNYPTHDLELAAVVFALKIWRHYLYGEKIQIYTDHKSLKYFF 1760
Cdd:cd09274      1 ILETDASDYGIGAVLSQedddgKERPIAFFSRKLTPAERNYSTTEKELLAIVWALKKFRHYLLGRPFTVYTDHKALKYLL 80
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|.
gi 1731019548 1761 TQKELNMRQRRWLELVKDYDCEILYHPGKANVVADALSRKV 1801
Cdd:cd09274     81 TQKDLNGRLARWLLLLSEFDFEIEYRPGKENVVADALSRLP 121
pepsin_retropepsin_like super family cl11403
Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular ...
1173-1304 3.97e-32

Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well-known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active-site Asp residues, with the N- and C-terminal lobes each contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers, and the monomer resembles the structure of the N- or C-terminal domain of the eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N- and C-terminal domains of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs) and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family).


The actual alignment was detected with superfamily member pfam08284:

Pssm-ID: 472175  Cd Length: 134  Bit Score: 122.92  E-value: 3.97e-32
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1173 QGRVFATTRQEAERAGTVVTGTLPILGHYAFVLFDSGSSHSFISSVFVQHVGLEVEPLGSVLSVSTPSGEVLLSKEQIKA 1252
Cdd:pfam08284    2 QGRVNHLSAEEAEASPDVIQGTFLVNSIPATVLFDSGATHSFISHAFVGKLKLPVESLSNPLCIETPTGGSVTTNLICPS 81
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1731019548 1253 CRVEIANRMLDVTLLVLDMQDFDVILGMDWLSANHANIDCYGKEVVFNPPSE 1304
Cdd:pfam08284   82 CPIEIQGISFLADLILLDMKDLDVILGMDWLSKNKANIDCARRTVTLTKERE 133
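The family description above quotes an active site motif, Asp-Thr/Ser-Gly-Ser. As a rough illustration, the sketch below scans the first query row of the pfam08284 alignment for that motif and converts the match offset into a residue number. The one-letter pattern `D[TS]GS` is a literal transcription of the quoted motif, and the helper name `find_motif_positions` is hypothetical. The approach assumes the scanned row contains no gap characters, which holds for the row used here.

```python
import re

# One-letter transcription of "Asp-Thr/Ser-Gly-Ser" from the family description.
ACTIVE_SITE = re.compile(r"D[TS]GS")


def find_motif_positions(segment, first_residue, pattern=ACTIVE_SITE):
    """Return (residue_number, matched_text) for each motif match.

    `segment` must be gap-free so that string offsets map directly onto
    residue numbers; `first_residue` is the number of its first residue.
    """
    return [(first_residue + m.start(), m.group()) for m in pattern.finditer(segment)]


# First query row of the alignment above (residues 1173-1252).
row = (
    "QGRVFATTRQEAERAGTVVTGTLPILGHYAFVLFDSGSSHSFISSVFVQHVGLEVEPLGSVLSVSTPSGEVLLSKEQIKA"
)
print(find_motif_positions(row, 1173))
```

A sequence match of this kind only flags a candidate site; it does not by itself show that the residue is catalytically active.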
Retrotrans_gag pfam03732
Retrotransposon gag protein; Gag or Capsid-like proteins from LTR retrotransposons. There is a ...
943-1038 4.67e-20

Retrotransposon gag protein; Gag or Capsid-like proteins from LTR retrotransposons. There is a central motif QGXXEXXXXXFXXLXXH that is common to Retroviridae gag-proteins, but is poorly conserved.



Pssm-ID: 367628  Cd Length: 97  Bit Score: 87.00  E-value: 4.67e-20
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  943 AVFFLEDRGTAWWETAERmlGGDVSKITWEQFKENFYAKFFSANVKHAKLQEFLNLEQGDMTVEQYDAEFDMLSRFAPDM 1022
Cdd:pfam03732    3 AVHSLRGAALTWWKSLVA--RSIDAFDSWDELKDAFLKRFFPSIRKDLLRNELRSLRQGTESVREYVERFKRLARQLPHH 80
                           90
                   ....*....|....*.
gi 1731019548 1023 VRDEAARTEKFVRGLR 1038
Cdd:pfam03732   81 GRDEEALISAFLRGLR 96
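The Retrotrans_gag description gives its central motif in X-wildcard notation, QGXXEXXXXXFXXLXXH. One way to search a sequence for such a motif is to translate it into a regular expression in which each X matches any single residue. The sketch below does that; the helper `motif_to_regex` is a hypothetical name, and treating X as "any one residue" is an assumption about the notation.

```python
import re


def motif_to_regex(motif):
    """Compile an X-wildcard motif (X = any single residue) into a regex."""
    parts = []
    for ch in motif.upper():
        # "X" becomes a one-character wildcard; other letters match literally.
        parts.append("." if ch == "X" else re.escape(ch))
    return re.compile("".join(parts))


GAG_MOTIF = motif_to_regex("QGXXEXXXXXFXXLXXH")

# Synthetic sequence built to contain the motif, for demonstration only.
demo = "AAQGAAEAAAAAFAALAAHAA"
match = GAG_MOTIF.search(demo)
print(GAG_MOTIF.pattern, match.start() if match else None)
```

Because the description notes that the motif is poorly conserved, a strict pattern like this can miss genuine gag domains. In the query segment aligned above, the single QG is followed by DMT rather than XXE, so the exact pattern does not match there.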
Integrase_H2C2 pfam17921
Integrase zinc binding domain; This zinc binding domain is found in a wide variety of ...
1891-1949 5.79e-16

Integrase zinc binding domain; This zinc binding domain is found in a wide variety of integrase proteins.



Pssm-ID: 465569 [Multi-domain]  Cd Length: 58  Bit Score: 73.82  E-value: 5.79e-16
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1731019548 1891 DSAVKTELLTEAHSSpfTMHPGSTKMYQDLRSVYWWRGMKRDVADFVSRCLVCQQVKAP 1949
Cdd:pfam17921    2 PKSLRKEILKEAHDS--GGHLGIEKTLARLRRRYWWPGMRKDVKKYVKSCETCQRRKPS 58
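The two-row alignments in this report can be summarized numerically. The sketch below computes percent identity over aligned columns for one pair of rows. It assumes, from the pattern seen throughout this report, that `-` marks a gap and that lowercase query letters mark residues left unaligned to the domain model; both kinds of column are skipped. The function name `percent_identity` and the choice of denominator (aligned columns only) are illustrative choices, not the method CD-Search uses for its own scores.

```python
def percent_identity(query_row, subject_row):
    """Percent of aligned columns in which query and subject residues agree.

    Columns with a gap ('-') in either row, or with a lowercase (assumed
    unaligned) query residue, are excluded from the calculation.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have equal length")
    aligned = 0
    matches = 0
    for q, s in zip(query_row, subject_row):
        if q == "-" or s == "-" or q.islower():
            continue
        aligned += 1
        if q == s:
            matches += 1
    return 100.0 * matches / aligned if aligned else 0.0


# Rows of the Integrase_H2C2 (pfam17921) alignment above.
query = "DSAVKTELLTEAHSSpfTMHPGSTKMYQDLRSVYWWRGMKRDVADFVSRCLVCQQVKAP"
model = "PKSLRKEILKEAHDS--GGHLGIEKTLARLRRRYWWPGMRKDVKKYVKSCETCQRRKPS"
print(round(percent_identity(query, model), 1))
```

For alignments that span several rows, the rows of each sequence would need to be concatenated before calling the function.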
PHA03378 super family cl33729
EBNA-3B; Provisional
497-672 4.35e-07

EBNA-3B; Provisional


The actual alignment was detected with superfamily member PHA03378:

Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 55.46  E-value: 4.35e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  497 AQPSRNRNRAVAELPAAPGESSKPICAAE--RPSEQPSLARRsrveaasaRPQPTRQPAEsrrfdPTQAQPKRQSVAVAE 574
Cdd:PHA03378   674 YQPSPTGANTMLPIQWAPGTMQPPPRAPTpmRPPAAPPGRAQ--------RPAAATGRAR-----PPAAAPGRARPPAAA 740
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  575 RSRGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREIP--VPSRAASAQAEPV-LSSSSRAARTKLAP 651
Cdd:PHA03378   741 PGRARPPAAAPGRARPPAAAPGRARPPAAAPGAPTPQPPPQAPPAPQQRPrgAPTPQPPPQAGPTsMQLMPRAAPGQQGP 820
                          170       180
                   ....*....|....*....|.
gi 1731019548  652 STYILRGISSVGARALQPPSR 672
Cdd:PHA03378   821 TKQILRQLLTGGVKRGRPSLK 841
transpos_IS481 super family cl41329
IS481 family transposase
1950-2078 3.65e-05

IS481 family transposase


The actual alignment was detected with superfamily member NF033577:

Pssm-ID: 468094 [Multi-domain]  Cd Length: 283  Bit Score: 47.97  E-value: 3.65e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1950 RQHPAGLLQplsvpgwkwesvsMDfITGLPKTL-RGYTVIWVVVDRLTKSAH--FVPGKSTYTASKwgqlYMTEIVRLHG 2026
Cdd:NF033577   124 RAHPGELWH-------------ID-IKKLGRIPdVGRLYLHTAIDDHSRFAYaeLYPDETAETAAD----FLRRAFAEHG 185
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1731019548 2027 VPV-SIISDRDARFTSKFwKGLQLAL---GTRLDFSTAFHPQTDGQTERLNQILED 2078
Cdd:NF033577   186 IPIrRVLTDNGSEFRSRA-HGFELALaelGIEHRRTRPYHPQTNGKVERFHRTLKD 240
CD_CSD super family cl28914
CHROMO (CHRomatin Organization MOdifier) domains and chromo shadow domains; Members of this ...
2268-2315 3.58e-03

CHROMO (CHRomatin Organization MOdifier) domains and chromo shadow domains; Members of this group are chromodomains or chromo shadow domains; these are SH3-fold-beta-barrel domains of the chromo-like superfamily. Chromodomains lack the first strand of the SH3-fold-beta-barrel; this first strand is altered by insertion in the chromo shadow domains. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in binding the proteins in which it is found to methylated histone tails and possibly RNA. Chromodomain-containing proteins include: i) those having an N-terminal chromodomain followed by a related chromo shadow domain, such as Drosophila and human heterochromatin protein Su(var)205 (HP1), and mammalian modifier 1 and 2; ii) those having a single chromodomain, such as Drosophila protein Polycomb (Pc), mammalian modifier 3, human Mi-2 autoantigen, and several yeast and Caenorhabditis elegans proteins of unknown function; iii) those having paired tandem chromodomains, such as mammalian DNA-binding/helicase proteins CHD-1 to CHD-4 and yeast protein CHD1; and iv) elongation factor eEF3, a member of the ATP-binding cassette (ABC) family of proteins, which serves an essential function in the translation cycle of fungi. eEF3 is a soluble factor lacking a transmembrane domain and having two ABC domains arranged in tandem, with a unique chromodomain inserted within the ABC2 domain.


The actual alignment was detected with superfamily member cd18979:

Pssm-ID: 475127  Cd Length: 48  Bit Score: 37.47  E-value: 3.58e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*...
gi 1731019548 2268 QPVEVLAREVKKLRSREIpLVKilWQNHGVEEATWEKEEDMRAQYPEL 2315
Cdd:cd18979      2 FPEKVLDIRQRDKGNKEF-LVQ--WQGLSVEEATWEPYKDLVQQFPDF 46
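The concise hit list above is ordered by E-value, which hides how the domains are arranged along the 2315+ residue query. A small sketch that re-sorts the hits by interval start makes the N- to C-terminal order explicit. The hit names, intervals and E-values are copied from the list above; the `Hit` tuple, the `architecture` function and the optional E-value cutoff are illustrative assumptions, not part of CD-Search.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    name: str
    start: int
    end: int
    e_value: float


# Concise hits as reported above, in their original (E-value) order.
CONCISE_HITS = [
    Hit("Glycosyltransferase_GTB-type", 1, 487, 0.0),
    Hit("RT_LTR", 1416, 1592, 2.07e-94),
    Hit("RNase_HI_RT_Ty3", 1686, 1801, 1.30e-57),
    Hit("pepsin_retropepsin_like", 1173, 1304, 3.97e-32),
    Hit("Retrotrans_gag", 943, 1038, 4.67e-20),
    Hit("Integrase_H2C2", 1891, 1949, 5.79e-16),
    Hit("PHA03378", 497, 672, 4.35e-07),
    Hit("transpos_IS481", 1950, 2078, 3.65e-05),
    Hit("CD_CSD", 2268, 2315, 3.58e-03),
]


def architecture(hits, max_e_value=1.0):
    """Return hit names ordered from N- to C-terminus, dropping weak hits."""
    kept = [h for h in hits if h.e_value <= max_e_value]
    return [h.name for h in sorted(kept, key=lambda h: h.start)]


for name in architecture(CONCISE_HITS):
    print(name)
```

Passing a stricter cutoff, for example `architecture(CONCISE_HITS, max_e_value=1e-10)`, drops the three weakest hits (PHA03378, transpos_IS481 and CD_CSD) and leaves the glycosyltransferase region followed by gag, protease, reverse transcriptase, RNase H and integrase.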
 
Name Accession Description Interval E-value
PLN02534 PLN02534
UDP-glycosyltransferase
1-487 0e+00

UDP-glycosyltransferase


Pssm-ID: 215293 [Multi-domain]  Cd Length: 491  Bit Score: 723.19  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548    1 MASTLSNQLelqpHFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPYNATRLESFFARAKQSSLSISLLEIPFPCLQ 80
Cdd:PLN02534     1 KAVSKAKQL----HFVLIPLMAQGHMIPMIDMARLLAERGVIVSLVTTPQNASRFAKTIDRARESGLPIRLVQIPFPCKE 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   81 VGLPLGCENLDTLPSRSLLRNFYKALSLLQQPLEQFLSRHHLNPTCIISDKYLYWTAQTAHKFKCPRVVFHGTGCFSLLS 160
Cdd:PLN02534    77 VGLPIGCENLDTLPSRDLLRKFYDAVDKLQQPLERFLEQAKPPPSCIISDKCLSWTSKTAQRFNIPRIVFHGMCCFSLLS 156
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  161 SHNLQLYSPHTSIDSNSQPFLVPGLPHKIEITKSQLPGSLIKSPDFDDFRDKITKAEQEAYGVVVNSFSELENGYYQNYE 240
Cdd:PLN02534   157 SHNIRLHNAHLSVSSDSEPFVVPGMPQSIEITRAQLPGAFVSLPDLDDVRNKMREAESTAFGVVVNSFNELEHGCAEAYE 236
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  241 RAISKKLWCIGPVSLCNENSIEKYNRGNKASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTRPFI 320
Cdd:PLN02534   237 KAIKKKVWCVGPVSLCNKRNLDKFERGNKASIDETQCLEWLDSMKPRSVIYACLGSLCRLVPSQLIELGLGLEASKKPFI 316
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  321 WVIKNrDENCSELEKWLSEEEFERKTKGRGLIIRGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQF 400
Cdd:PLN02534   317 WVIKT-GEKHSELEEWLVKENFEERIKGRGLLIKGWAPQVLILSHPAIGGFLTHCGWNSTIEGICSGVPMITWPLFAEQF 395
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  401 LNEKLVVEILKIGVRVGVEGAVRWGEEERVGVMAKKEEIEKAIEMVMDG-GEEGEERRRRVGDLSKMAPKAMENGGSSYV 479
Cdd:PLN02534   396 LNEKLIVEVLRIGVRVGVEVPVRWGDEERVGVLVKKDEVEKAVKTLMDDgGEEGERRRRRAQELGVMARKAMELGGSSHI 475

                   ....*...
gi 1731019548  480 NLSLFIED 487
Cdd:PLN02534   476 NLSILIQD 483
RT_LTR cd01647
RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long ...
1416-1592 2.07e-94

RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long terminal repeats (LTRs) in their DNA copies but not in their RNA template. RT catalyzes DNA replication from an RNA template, and is responsible for the replication of retroelements. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs are present in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and Caulimoviruses.


Pssm-ID: 238825  Cd Length: 177  Bit Score: 302.98  E-value: 2.07e-94
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1416 GFIRPSVSPWGAPVLFVKKKDGSMRLCIDYRELNKVTVKNRYPLPRIDDLFDQLQGATVFSKIDLRSGYHQLRIRDGDIP 1495
Cdd:cd01647      1 GIIEPSSSPYASPVVVVKKKDGKLRLCVDYRKLNKVTIKDRYPLPTIDELLEELAGAKVFSKLDLRSGYHQIPLAEESRP 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1496 KTAFRSRYGHYEFVVMSFGLTNAPAVFMDLMNRVFKEFLDSFVIVFIDDILIYSKTEAEHEEHLHQVLETLRANKLYAKF 1575
Cdd:cd01647     81 KTAFRTPFGLYEYTRMPFGLKNAPATFQRLMNKILGDLLGDFVEVYLDDILVYSKTEEEHLEHLREVLERLREAGLKLNP 160
                          170
                   ....*....|....*..
gi 1731019548 1576 SKCEFWLRKVTFLGHVV 1592
Cdd:cd01647    161 EKCEFGVPEVEFLGHIV 177
GT1_Gtf-like cd03784
UDP-glycosyltransferases and similar proteins; This family includes the Gtfs, a group of ...
13-448 5.51e-67

UDP-glycosyltransferases and similar proteins; This family includes the Gtfs, a group of homologous glycosyltransferases involved in the final stages of the biosynthesis of antibiotics vancomycin and related chloroeremomycin. Gtfs transfer sugar moieties from an activated NDP-sugar donor to the oxidatively cross-linked heptapeptide core of vancomycin group antibiotics. The core structure is important for the bioactivity of the antibiotics.


Pssm-ID: 340817 [Multi-domain]  Cd Length: 404  Bit Score: 233.21  E-value: 5.51e-67
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   13 PHFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPYNatrLESFFARAKQSSLSISLLEIPFPCLQVglplgcENLDT 92
Cdd:cd03784      1 MRILFVPFPGQGHVNPMLPLAKALAARGHEVTVATPPFN---FADLVEAAGLTFVPVGDDPDELELDSE------TNLGP 71
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   93 LPSRSLLRNFYKALSLLQQPLEQFLsRHHLNPTCIISDKYLYWTAQTAHKFKCPRVVFHgtgcfsllsshnlqlysphts 172
Cdd:cd03784     72 DSLLELLRRLLKAADELLDDLLAAL-RSSWKPDLVIADPFAYAGPLVAEELGIPSVRLF--------------------- 129
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  173 idsnSQPFLVPGLPHKIEITKSQLPGSLIKSPDFDDFRDKITKAEQEAYGVVVNSFSelengyyqNYERAISKKLWCIGP 252
Cdd:cd03784    130 ----TGPATLLSAYLHPFGVLNLLLSSLLEPELFLDPLLEVLDRLRERLGLPPFSLV--------LLLLRLVPPLYVIGP 197
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  253 VSLCNE-------NSIEKYNRGNKASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLG-QCLESSTRPFIWVIK 324
Cdd:cd03784    198 TFPSLPpdrprlpSVLGGLRIVPKNGPLPDELWEWLDKQPPRSVVYVSFGSMVRDLPEELLELIaEALASLGQRFLWVVG 277
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  325 NrdencselekwlSEEEFERKTKGRGLIIRgWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNeK 404
Cdd:cd03784    278 P------------DPLGGLERLPDNVLVVK-WVPQDELLAHPAVGAFVTHGGWNSTLEALYAGVPMVVVPLFADQPNN-A 343
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|....
gi 1731019548  405 LVVEILKIGVRVGVEgavrwgeeervgvMAKKEEIEKAIEMVMD 448
Cdd:cd03784    344 ARVEELGAGVELDKD-------------ELTAEELAKAVREVLE 374
RNase_HI_RT_Ty3 cd09274
Ty3/Gypsy family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ...
1686-1801 1.30e-57

Ty3/Gypsy family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and in non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Based on phylogenetic patterns, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The Ty3/Gypsy family is widely distributed among the genomes of plants, fungi and animals. RNase H has been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260006 [Multi-domain]  Cd Length: 121  Bit Score: 195.02  E-value: 1.30e-57
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1686 VIYSDASKKGLGCVLMQ-----QGKVVAYASRQLKIHEQNYPTHDLELAAVVFALKIWRHYLYGEKIQIYTDHKSLKYFF 1760
Cdd:cd09274      1 ILETDASDYGIGAVLSQedddgKERPIAFFSRKLTPAERNYSTTEKELLAIVWALKKFRHYLLGRPFTVYTDHKALKYLL 80
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|.
gi 1731019548 1761 TQKELNMRQRRWLELVKDYDCEILYHPGKANVVADALSRKV 1801
Cdd:cd09274     81 TQKDLNGRLARWLLLLSEFDFEIEYRPGKENVVADALSRLP 121
RT_RNaseH pfam17917
RNase H-like domain found in reverse transcriptase; DNA polymerase and ribonuclease H (RNase H) ...
1685-1775 2.42e-41

RNase H-like domain found in reverse transcriptase; DNA polymerase and ribonuclease H (RNase H) activities allow reverse transcriptases to convert the single-stranded retroviral RNA genome into double-stranded DNA, which is integrated into the host chromosome during infection. This entry represents the RNase H like domain.


Pssm-ID: 465565  Cd Length: 104  Bit Score: 148.04  E-value: 2.42e-41
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1685 FVIYSDASKKGLGCVLMQQG-----KVVAYASRQLKIHEQNYPTHDLELAAVVFALKIWRHYLYGEKIQIYTDHKSLKYF 1759
Cdd:pfam17917    6 FILETDASDYGIGAVLSQKDedgkeRPIAYASRKLTPAERNYSTTEKELLAIVWALKKFRHYLLGRKFTVYTDHKPLKYL 85
                           90
                   ....*....|....*.
gi 1731019548 1760 FTQKELNMRQRRWLEL 1775
Cdd:pfam17917   86 FTPKELNGRLARWALF 101
RVT_1 pfam00078
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
1432-1592 6.27e-38

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses.


Pssm-ID: 395031 [Multi-domain]  Cd Length: 189  Bit Score: 141.67  E-value: 6.27e-38
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1432 VKKKD-GSMRLC----IDYRELNKVTVK-------NRYPLPRIDDLFDQLQGATVFSKIDLRSGYHQLRIRDGDIPKTAF 1499
Cdd:pfam00078    1 IPKKGkGKYRPIsllsIDYKALNKIIVKrlkpenlDSPPQPGFRPGLAKLKKAKWFLKLDLKKAFDQVPLDELDRKLTAF 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1500 R-----------SRYGHYEFVVMSFGLTNAPAVFMDLMNRVFKEFL---DSFVIVFIDDILIYSKTEAEHEEHLHQVLET 1565
Cdd:pfam00078   81 TtppininwngeLSGGRYEWKGLPQGLVLSPALFQLFMNELLRPLRkraGLTLVRYADDILIFSKSEEEHQEALEEVLEW 160
                          170       180
                   ....*....|....*....|....*....
gi 1731019548 1566 LRANKLYAKFSKCEFWL--RKVTFLGHVV 1592
Cdd:pfam00078  161 LKESGLKINPEKTQFFLksKEVKYLGVTL 189
RVP_2 pfam08284
Retroviral aspartyl protease; Single domain aspartyl proteases from retroviruses, ...
1173-1304 3.97e-32

Retroviral aspartyl protease; Single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases.


Pssm-ID: 400537  Cd Length: 134  Bit Score: 122.92  E-value: 3.97e-32
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1173 QGRVFATTRQEAERAGTVVTGTLPILGHYAFVLFDSGSSHSFISSVFVQHVGLEVEPLGSVLSVSTPSGEVLLSKEQIKA 1252
Cdd:pfam08284    2 QGRVNHLSAEEAEASPDVIQGTFLVNSIPATVLFDSGATHSFISHAFVGKLKLPVESLSNPLCIETPTGGSVTTNLICPS 81
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1731019548 1253 CRVEIANRMLDVTLLVLDMQDFDVILGMDWLSANHANIDCYGKEVVFNPPSE 1304
Cdd:pfam08284   82 CPIEIQGISFLADLILLDMKDLDVILGMDWLSKNKANIDCARRTVTLTKERE 133
Retrotrans_gag pfam03732
Retrotransposon gag protein; Gag or Capsid-like proteins from LTR retrotransposons. There is a ...
943-1038 4.67e-20

Retrotransposon gag protein; Gag or Capsid-like proteins from LTR retrotransposons. There is a central motif QGXXEXXXXXFXXLXXH that is common to Retroviridae gag-proteins, but is poorly conserved.


Pssm-ID: 367628  Cd Length: 97  Bit Score: 87.00  E-value: 4.67e-20
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  943 AVFFLEDRGTAWWETAERmlGGDVSKITWEQFKENFYAKFFSANVKHAKLQEFLNLEQGDMTVEQYDAEFDMLSRFAPDM 1022
Cdd:pfam03732    3 AVHSLRGAALTWWKSLVA--RSIDAFDSWDELKDAFLKRFFPSIRKDLLRNELRSLRQGTESVREYVERFKRLARQLPHH 80
                           90
                   ....*....|....*.
gi 1731019548 1023 VRDEAARTEKFVRGLR 1038
Cdd:pfam03732   81 GRDEEALISAFLRGLR 96
Integrase_H2C2 pfam17921
Integrase zinc binding domain; This zinc binding domain is found in a wide variety of ...
1891-1949 5.79e-16

Integrase zinc binding domain; This zinc binding domain is found in a wide variety of integrase proteins.


Pssm-ID: 465569 [Multi-domain]  Cd Length: 58  Bit Score: 73.82  E-value: 5.79e-16
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1731019548 1891 DSAVKTELLTEAHSSpfTMHPGSTKMYQDLRSVYWWRGMKRDVADFVSRCLVCQQVKAP 1949
Cdd:pfam17921    2 PKSLRKEILKEAHDS--GGHLGIEKTLARLRRRYWWPGMRKDVKKYVKSCETCQRRKPS 58
UDPGT pfam00201
UDP-glucoronosyl and UDP-glucosyl transferase;
289-448 7.84e-12

UDP-glucoronosyl and UDP-glucosyl transferase;


Pssm-ID: 278624 [Multi-domain]  Cd Length: 499  Bit Score: 70.13  E-value: 7.84e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  289 VLYICLGSLCRMLPSQ-LIQLGQCLESSTRPFIWviknrdencselekwlSEEEFERKTKGRGLIIRGWAPQLLILSHWS 367
Cdd:pfam00201  277 VVVFSLGSMVSNIPEEkANAIASALAQIPQKVLW----------------RFDGTKPSTLGNNTRLVKWLPQNDLLGHPK 340
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  368 TGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNEKLVVEilkigvrVGVEGAVRWGEeervgvMAkKEEIEKAIEMVM 447
Cdd:pfam00201  341 TRAFITHAGSNGVYEAICHGVPMVGMPLFGDQMDNAKHMEA-------KGAAVTLNVLT------MT-SEDLLNALKEVI 406

                   .
gi 1731019548  448 D 448
Cdd:pfam00201  407 N 407
retropepsin_like cd00303
Retropepsins; pepsin-like aspartate proteases; The family includes pepsin-like aspartate ...
1219-1283 9.35e-10

Retropepsins; pepsin-like aspartate proteases; The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-terminal lobes, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.


Pssm-ID: 133136  Cd Length: 92  Bit Score: 57.35  E-value: 9.35e-10
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1731019548 1219 FVQHVGLEVEPLGSVLSVSTPSGEVLLSKEQIKACRVEIANRMLDVTLLVLDMQDFDVILGMDWL 1283
Cdd:cd00303     27 LAKKLGLPPRLLPTPLKVKGANGSSVKTLGVILPVTIGIGGKTFTVDFYVLDLLSYDVILGRPWL 91
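
The retropepsin description above places the active-site aspartate within an Asp-Thr/Ser-Gly motif. The sketch below scans part of the query for that motif and reports positions in full-protein coordinates. It assumes the query residues 1173-1252 as shown in the pfam08284 alignment earlier on this page; the function name `find_motif` is illustrative.

```python
import re

# Asp, then Thr or Ser, then Gly.
ACTIVE_SITE = re.compile(r"D[TS]G")

def find_motif(segment: str, first_residue: int) -> list[int]:
    """Return residue numbers (full-protein coordinates) where a D-[T/S]-G match starts.

    `first_residue` is the protein coordinate of segment[0]; the segment must be gap-free.
    """
    return [first_residue + m.start() for m in ACTIVE_SITE.finditer(segment.upper())]

# Query residues 1173-1252, copied from the pfam08284 alignment (no gaps in this row).
segment = "QGRVFATTRQEAERAGTVVTGTLPILGHYAFVLFDSGSSHSFISSVFVQHVGLEVEPLGSVLSVSTPSGEVLLSKEQIKA"

hits = find_motif(segment, 1173)
print(hits)
```

In this segment the only match is D-S-G starting at residue 1207. That is inside the pfam08284 hit but upstream of the 1219-1283 interval reported for cd00303. A motif match alone does not show catalytic activity; it only locates a candidate site for closer inspection.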
MGT TIGR01426
glycosyltransferase, MGT family; This model describes the MGT (macroside glycosyltransferase) ...
352-409 6.59e-09

glycosyltransferase, MGT family; This model describes the MGT (macroside glycosyltransferase) subfamily of the UDP-glucuronosyltransferase family. Members include a number of glucosyl transferases for macrolide antibiotic inactivation, but also include transferases of glucose-related sugars for macrolide antibiotic production. [Cellular processes, Toxin production and resistance]


Pssm-ID: 273616 [Multi-domain]  Cd Length: 392  Bit Score: 60.47  E-value: 6.59e-09
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1731019548  352 IIRGWAPQLLILSHwsTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNEKLVVEI 409
Cdd:TIGR01426  278 EVRQWVPQLEILKK--ADAFITHGGMNSTMEALFNGVPMVAVPQGADQPMTARRIAEL 333
YjiC COG1819
UDP:flavonoid glycosyltransferase YjiC, YdhE family [Carbohydrate transport and metabolism];
351-416 8.15e-09

UDP:flavonoid glycosyltransferase YjiC, YdhE family [Carbohydrate transport and metabolism];


Pssm-ID: 441424 [Multi-domain]  Cd Length: 268  Bit Score: 59.10  E-value: 8.15e-09
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1731019548  351 LIIRGWAPQLLILSHwsTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNEKLVVEiLKIGVRV 416
Cdd:COG1819    171 VRVVDYVPQDALLPR--ADAVVHHGGAGTTAEALRAGVPQVVVPFGGDQPLNAARVER-LGAGLAL 233
PHA03378 PHA03378
EBNA-3B; Provisional
497-672 4.35e-07

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 55.46  E-value: 4.35e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  497 AQPSRNRNRAVAELPAAPGESSKPICAAE--RPSEQPSLARRsrveaasaRPQPTRQPAEsrrfdPTQAQPKRQSVAVAE 574
Cdd:PHA03378   674 YQPSPTGANTMLPIQWAPGTMQPPPRAPTpmRPPAAPPGRAQ--------RPAAATGRAR-----PPAAAPGRARPPAAA 740
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  575 RSRGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREIP--VPSRAASAQAEPV-LSSSSRAARTKLAP 651
Cdd:PHA03378   741 PGRARPPAAAPGRARPPAAAPGRARPPAAAPGAPTPQPPPQAPPAPQQRPrgAPTPQPPPQAGPTsMQLMPRAAPGQQGP 820
                          170       180
                   ....*....|....*....|.
gi 1731019548  652 STYILRGISSVGARALQPPSR 672
Cdd:PHA03378   821 TKQILRQLLTGGVKRGRPSLK 841
transpos_IS481 NF033577
IS481 family transposase
1950-2078 3.65e-05

IS481 family transposase


Pssm-ID: 468094 [Multi-domain]  Cd Length: 283  Bit Score: 47.97  E-value: 3.65e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1950 RQHPAGLLQplsvpgwkwesvsMDfITGLPKTL-RGYTVIWVVVDRLTKSAH--FVPGKSTYTASKwgqlYMTEIVRLHG 2026
Cdd:NF033577   124 RAHPGELWH-------------ID-IKKLGRIPdVGRLYLHTAIDDHSRFAYaeLYPDETAETAAD----FLRRAFAEHG 185
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1731019548 2027 VPV-SIISDRDARFTSKFwKGLQLAL---GTRLDFSTAFHPQTDGQTERLNQILED 2078
Cdd:NF033577   186 IPIrRVLTDNGSEFRSRA-HGFELALaelGIEHRRTRPYHPQTNGKVERFHRTLKD 240
rve pfam00665
Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into ...
1963-2063 4.41e-05

Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc binding domain (pfam02022). This domain is the central catalytic domain. The carboxyl-terminal domain is a non-specific DNA binding domain (pfam00552). The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.


Pssm-ID: 459897 [Multi-domain]  Cd Length: 98  Bit Score: 44.23  E-value: 4.41e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1963 PGWKWEsvsMDFITGLPKTLRGYTVIWVVVDRLTK--SAHFVPGKSTYTAskWGQLYMTEIVRLHGVPVSIISDRDARFT 2040
Cdd:pfam00665    1 PNQLWQ---GDFTYIRIPGGGGKLYLLVIVDDFSReiLAWALSSEMDAEL--VLDALERAIAFRGGVPLIIHSDNGSEYT 75
                           90       100
                   ....*....|....*....|...
gi 1731019548 2041 SKFWKGLQLALGTRLDFSTAFHP 2063
Cdd:pfam00665   76 SKAFREFLKDLGIKPSFSRPGNP 98
RnhA COG0328
Ribonuclease HI [Replication, recombination and repair];
1686-1800 4.79e-05

Ribonuclease HI [Replication, recombination and repair];


Pssm-ID: 440097 [Multi-domain]  Cd Length: 136  Bit Score: 45.22  E-value: 4.79e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1686 VIYSDAS------KKGLGCVLMQQGKVvayasRQLKIHEQNYPTHDLELAAVVFALKIWRHyLYGEKIQIYTDHKSLKYF 1759
Cdd:COG0328      4 EIYTDGAcrgnpgPGGWGAVIRYGGEE-----KELSGGLGDTTNNRAELTALIAALEALKE-LGPCEVEIYTDSQYVVNQ 77
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1731019548 1760 FTQKELNMRQRRW------------LELVKDYDCEILYHPGKA----NVVADALSRK 1800
Cdd:COG0328     78 ITGWIHGWKKNGWkpvknpdlwqrlDELLARHKVTFEWVKGHAghpgNERADALANK 134
CD_POL_like cd18979
chromodomain of a Zea mays putative Metaviridae (gypsy-type) retrotransposon polyprotein ...
2268-2315 3.58e-03

chromodomain of a Zea mays putative Metaviridae (gypsy-type) retrotransposon polyprotein (Z195D10.9), and similar proteins; This subgroup includes the CHROMO (CHRomatin Organization MOdifier) domain found in the Zea mays Z195D10.9 protein and in other putative TY3/gypsy retrotransposon polyproteins. The pol gene in TY3/gypsy elements generally encodes domains in the following order: an aspartyl protease, a reverse transcriptase, RNase H, and an integrase; here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and is implicated in binding those proteins to methylated histone tails and possibly RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.


Pssm-ID: 349335  Cd Length: 48  Bit Score: 37.47  E-value: 3.58e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*...
gi 1731019548 2268 QPVEVLAREVKKLRSREIpLVKilWQNHGVEEATWEKEEDMRAQYPEL 2315
Cdd:cd18979      2 FPEKVLDIRQRDKGNKEF-LVQ--WQGLSVEEATWEPYKDLVQQFPDF 46
growth_prot_Scy NF041483
polarized growth protein Scy;
505-650 4.22e-03

polarized growth protein Scy;


Pssm-ID: 469371 [Multi-domain]  Cd Length: 1293  Bit Score: 42.51  E-value: 4.22e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  505 RAVAE--LPAAPGEsskpicaAERPSEQPSlaRRSRVEAASARPQPT----RQPAESRRF---DPTQAQpkrQSVAVAER 575
Cdd:NF041483   173 RAEAEqaLAAARAE-------AERLAEEAR--QRLGSEAESARAEAEailrRARKDAERLlnaASTQAQ---EATDHAEQ 240
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1731019548  576 SRGSRTTNSRrAARNPEAEASRAKPSRVSPvasraasppvASRAAREipvpsraASAQAEPVLSSSSRAARTKLA 650
Cdd:NF041483   241 LRSSTAAESD-QARRQAAELSRAAEQRMQE----------AEEALRE-------ARAEAEKVVAEAKEAAAKQLA 297
SepH NF040712
septation protein SepH; Septation protein H (SepH) was first characterized in Streptomyces ...
496-627 9.18e-03

septation protein SepH; Septation protein H (SepH) was first characterized in Streptomyces venezuelae, and homologs were identified in Mycobacterium smegmatis. SepH contains an N-terminal DUF3071 domain and a conserved C-terminal region. It binds directly to cell division protein FtsZ to stimulate the assembly of FtsZ protofilaments.


Pssm-ID: 468676 [Multi-domain]  Cd Length: 346  Bit Score: 40.91  E-value: 9.18e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  496 AAQPSRNRNRAVAELPAAPGEsskpicAAERPSEQPSLARRSRVEAASARPQPTRQPAESRRFDPTQAQPKRQSVAVAER 575
Cdd:NF040712   206 AREPADARPEEVEPAPAAEGA------PATDSDPAEAGTPDDLASARRRRAGVEQPEDEPVGPGAAPAAEPDEATRDAGE 279
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1731019548  576 SRGSRTTNSRRAARNPE----AEASRAKPSRVSPVASRAASPPVASRAAREIPVPS 627
Cdd:NF040712   280 PPAPGAAETPEAAEPPApapaAPAAPAAPEAEEPARPEPPPAPKPKRRRRRASVPS 335
 
Name Accession Description Interval E-value
PLN02534 PLN02534
UDP-glycosyltransferase
1-487 0e+00

UDP-glycosyltransferase


Pssm-ID: 215293 [Multi-domain]  Cd Length: 491  Bit Score: 723.19  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548    1 MASTLSNQLelqpHFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPYNATRLESFFARAKQSSLSISLLEIPFPCLQ 80
Cdd:PLN02534     1 KAVSKAKQL----HFVLIPLMAQGHMIPMIDMARLLAERGVIVSLVTTPQNASRFAKTIDRARESGLPIRLVQIPFPCKE 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   81 VGLPLGCENLDTLPSRSLLRNFYKALSLLQQPLEQFLSRHHLNPTCIISDKYLYWTAQTAHKFKCPRVVFHGTGCFSLLS 160
Cdd:PLN02534    77 VGLPIGCENLDTLPSRDLLRKFYDAVDKLQQPLERFLEQAKPPPSCIISDKCLSWTSKTAQRFNIPRIVFHGMCCFSLLS 156
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  161 SHNLQLYSPHTSIDSNSQPFLVPGLPHKIEITKSQLPGSLIKSPDFDDFRDKITKAEQEAYGVVVNSFSELENGYYQNYE 240
Cdd:PLN02534   157 SHNIRLHNAHLSVSSDSEPFVVPGMPQSIEITRAQLPGAFVSLPDLDDVRNKMREAESTAFGVVVNSFNELEHGCAEAYE 236
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  241 RAISKKLWCIGPVSLCNENSIEKYNRGNKASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTRPFI 320
Cdd:PLN02534   237 KAIKKKVWCVGPVSLCNKRNLDKFERGNKASIDETQCLEWLDSMKPRSVIYACLGSLCRLVPSQLIELGLGLEASKKPFI 316
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  321 WVIKNrDENCSELEKWLSEEEFERKTKGRGLIIRGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQF 400
Cdd:PLN02534   317 WVIKT-GEKHSELEEWLVKENFEERIKGRGLLIKGWAPQVLILSHPAIGGFLTHCGWNSTIEGICSGVPMITWPLFAEQF 395
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  401 LNEKLVVEILKIGVRVGVEGAVRWGEEERVGVMAKKEEIEKAIEMVMDG-GEEGEERRRRVGDLSKMAPKAMENGGSSYV 479
Cdd:PLN02534   396 LNEKLIVEVLRIGVRVGVEVPVRWGDEERVGVLVKKDEVEKAVKTLMDDgGEEGERRRRRAQELGVMARKAMELGGSSHI 475

                   ....*...
gi 1731019548  480 NLSLFIED 487
Cdd:PLN02534   476 NLSILIQD 483
PLN03007 PLN03007
UDP-glucosyltransferase family protein
14-487 6.34e-137

UDP-glucosyltransferase family protein


Pssm-ID: 178584 [Multi-domain]  Cd Length: 482  Bit Score: 437.75  E-value: 6.34e-137
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   14 HFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPYNATRLESFFARAKQ--SSLSISLLEIPFPCLQVGLPLGCENLD 91
Cdd:PLN03007     7 HILFFPFMAHGHMIPTLDMAKLFSSRGAKSTILTTPLNAKIFEKPIEAFKNlnPGLEIDIQIFNFPCVELGLPEGCENVD 86
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   92 TLPSRS------LLRNFYKALSLLQQPLEQFLSRhhLNPTCIISDKYLYWTAQTAHKFKCPRVVFHGTGCFSLLSSHNLQ 165
Cdd:PLN03007    87 FITSNNnddsgdLFLKFLFSTKYFKDQLEKLLET--TRPDCLVADMFFPWATEAAEKFGVPRLVFHGTGYFSLCASYCIR 164
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  166 LYSPHTSIDSNSQPFLVPGLPHKIEITKSQLPGSLIKSPdFDDFRDKITKAEQEAYGVVVNSFSELENGYYQNYERAISK 245
Cdd:PLN03007   165 VHKPQKKVASSSEPFVIPDLPGDIVITEEQINDADEESP-MGKFMKEVRESEVKSFGVLVNSFYELESAYADFYKSFVAK 243
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  246 KLWCIGPVSLCNENSIEKYNRGNKASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTRPFIWVIkN 325
Cdd:PLN03007   244 RAWHIGPLSLYNRGFEEKAERGKKANIDEQECLKWLDSKKPDSVIYLSFGSVASFKNEQLFEIAAGLEGSGQNFIWVV-R 322
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  326 RDENCSELEKWLSEEeFERKTKGRGLIIRGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNEKL 405
Cdd:PLN03007   323 KNENQGEKEEWLPEG-FEERTKGKGLIIRGWAPQVLILDHQATGGFVTHCGWNSLLEGVAAGLPMVTWPVGAEQFYNEKL 401
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  406 VVEILKIGVRVGVEGAVRwgeeeRVGVMAKKEEIEKAIEMVMdGGEEGEERRRRVGDLSKMAPKAMENGGSSYVNLSLFI 485
Cdd:PLN03007   402 VTQVLRTGVSVGAKKLVK-----VKGDFISREKVEKAVREVI-VGEEAEERRLRAKKLAEMAKAAVEEGGSSFNDLNKFM 475

                   ..
gi 1731019548  486 ED 487
Cdd:PLN03007   476 EE 477
RT_LTR cd01647
RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long ...
1416-1592 2.07e-94

RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long terminal repeats (LTRs) in their DNA copies but not in their RNA template. RT catalyzes DNA replication from an RNA template, and is responsible for the replication of retroelements. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs are present in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and Caulimoviruses.


Pssm-ID: 238825  Cd Length: 177  Bit Score: 302.98  E-value: 2.07e-94
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1416 GFIRPSVSPWGAPVLFVKKKDGSMRLCIDYRELNKVTVKNRYPLPRIDDLFDQLQGATVFSKIDLRSGYHQLRIRDGDIP 1495
Cdd:cd01647      1 GIIEPSSSPYASPVVVVKKKDGKLRLCVDYRKLNKVTIKDRYPLPTIDELLEELAGAKVFSKLDLRSGYHQIPLAEESRP 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1496 KTAFRSRYGHYEFVVMSFGLTNAPAVFMDLMNRVFKEFLDSFVIVFIDDILIYSKTEAEHEEHLHQVLETLRANKLYAKF 1575
Cdd:cd01647     81 KTAFRTPFGLYEYTRMPFGLKNAPATFQRLMNKILGDLLGDFVEVYLDDILVYSKTEEEHLEHLREVLERLREAGLKLNP 160
                          170
                   ....*....|....*..
gi 1731019548 1576 SKCEFWLRKVTFLGHVV 1592
Cdd:cd01647    161 EKCEFGVPEVEFLGHIV 177
PLN02863 PLN02863
UDP-glucoronosyl/UDP-glucosyl transferase family protein
7-486 4.10e-79

UDP-glucoronosyl/UDP-glucosyl transferase family protein


Pssm-ID: 215465  Cd Length: 477  Bit Score: 270.97  E-value: 4.10e-79
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548    7 NQLELQPHFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPYNATRLESFFARakqsSLSISLLEIPFPClQVGLPLG 86
Cdd:PLN02863     4 LNKPAGTHVLVFPFPAQGHMIPLLDLTHRLALRGLTITVLVTPKNLPFLNPLLSK----HPSIETLVLPFPS-HPSIPSG 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   87 CENLDTLPSrSLLRNFYKALSLLQQPLEQFLSRHHLNPTCIISDKYLYWTAQTAHKFKCPRVVFHGTGCFSLLSSHNLQL 166
Cdd:PLN02863    79 VENVKDLPP-SGFPLMIHALGELYAPLLSWFRSHPSPPVAIISDMFLGWTQNLACQLGIRRFVFSPSGAMALSIMYSLWR 157
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  167 YSPHTSIDSNSQPFL----VPGLPHKIEITKSQLPGSLIKSPDFDDFRDKITKAEQEAYGVVVNSFSELENGYYQNYERA 242
Cdd:PLN02863   158 EMPTKINPDDQNEILsfskIPNCPKYPWWQISSLYRSYVEGDPAWEFIKDSFRANIASWGLVVNSFTELEGIYLEHLKKE 237
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  243 I-SKKLWCIGPVSLCNENSIEKYNRGNKASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTRPFIW 321
Cdd:PLN02863   238 LgHDRVWAVGPILPLSGEKSGLMERGGPSSVSVDDVMTWLDTCEDHKVVYVCFGSQVVLTKEQMEALASGLEKSGVHFIW 317
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  322 VIKNRDEncSELEKWLSEEEFERKTKGRGLIIRGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFL 401
Cdd:PLN02863   318 CVKEPVN--EESDYSNIPSGFEDRVAGRGLVIRGWAPQVAILSHRAVGAFLTHCGWNSVLEGLVAGVPMLAWPMAADQFV 395
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  402 NEKLVVEILKIGVRVgVEGAvrwgeeervGVMAKKEEIEKAIemvMDGGEEGEERRRRVGDLSKMAPKAMENGGSSYVNL 481
Cdd:PLN02863   396 NASLLVDELKVAVRV-CEGA---------DTVPDSDELARVF---MESVSENQVERERAKELRRAALDAIKERGSSVKDL 462

                   ....*
gi 1731019548  482 SLFIE 486
Cdd:PLN02863   463 DGFVK 467
GT1_Gtf-like cd03784
UDP-glycosyltransferases and similar proteins; This family includes the Gtfs, a group of ...
13-448 5.51e-67

UDP-glycosyltransferases and similar proteins; This family includes the Gtfs, a group of homologous glycosyltransferases involved in the final stages of the biosynthesis of the antibiotic vancomycin and the related chloroeremomycin. Gtfs transfer sugar moieties from an activated NDP-sugar donor to the oxidatively cross-linked heptapeptide core of vancomycin group antibiotics. The core structure is important for the bioactivity of the antibiotics.


Pssm-ID: 340817 [Multi-domain]  Cd Length: 404  Bit Score: 233.21  E-value: 5.51e-67
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   13 PHFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPYNatrLESFFARAKQSSLSISLLEIPFPCLQVglplgcENLDT 92
Cdd:cd03784      1 MRILFVPFPGQGHVNPMLPLAKALAARGHEVTVATPPFN---FADLVEAAGLTFVPVGDDPDELELDSE------TNLGP 71
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   93 LPSRSLLRNFYKALSLLQQPLEQFLsRHHLNPTCIISDKYLYWTAQTAHKFKCPRVVFHgtgcfsllsshnlqlysphts 172
Cdd:cd03784     72 DSLLELLRRLLKAADELLDDLLAAL-RSSWKPDLVIADPFAYAGPLVAEELGIPSVRLF--------------------- 129
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  173 idsnSQPFLVPGLPHKIEITKSQLPGSLIKSPDFDDFRDKITKAEQEAYGVVVNSFSelengyyqNYERAISKKLWCIGP 252
Cdd:cd03784    130 ----TGPATLLSAYLHPFGVLNLLLSSLLEPELFLDPLLEVLDRLRERLGLPPFSLV--------LLLLRLVPPLYVIGP 197
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  253 VSLCNE-------NSIEKYNRGNKASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLG-QCLESSTRPFIWVIK 324
Cdd:cd03784    198 TFPSLPpdrprlpSVLGGLRIVPKNGPLPDELWEWLDKQPPRSVVYVSFGSMVRDLPEELLELIaEALASLGQRFLWVVG 277
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  325 NrdencselekwlSEEEFERKTKGRGLIIRgWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNeK 404
Cdd:cd03784    278 P------------DPLGGLERLPDNVLVVK-WVPQDELLAHPAVGAFVTHGGWNSTLEALYAGVPMVVVPLFADQPNN-A 343
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|....
gi 1731019548  405 LVVEILKIGVRVGVEgavrwgeeervgvMAKKEEIEKAIEMVMD 448
Cdd:cd03784    344 ARVEELGAGVELDKD-------------ELTAEELAKAVREVLE 374
RNase_HI_RT_Ty3 cd09274
Ty3/Gypsy family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ...
1686-1801 1.30e-57

Ty3/Gypsy family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and in non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Based on phylogenetic patterns, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The Ty3/Gypsy family is widely distributed among the genomes of plants, fungi and animals. RNase H inhibitors have been explored as anti-HIV drugs because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260006 [Multi-domain]  Cd Length: 121  Bit Score: 195.02  E-value: 1.30e-57
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1686 VIYSDASKKGLGCVLMQ-----QGKVVAYASRQLKIHEQNYPTHDLELAAVVFALKIWRHYLYGEKIQIYTDHKSLKYFF 1760
Cdd:cd09274      1 ILETDASDYGIGAVLSQedddgKERPIAFFSRKLTPAERNYSTTEKELLAIVWALKKFRHYLLGRPFTVYTDHKALKYLL 80
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|.
gi 1731019548 1761 TQKELNMRQRRWLELVKDYDCEILYHPGKANVVADALSRKV 1801
Cdd:cd09274     81 TQKDLNGRLARWLLLLSEFDFEIEYRPGKENVVADALSRLP 121
PLN02448 PLN02448
UDP-glycosyltransferase family protein
14-487 2.73e-51

UDP-glycosyltransferase family protein


Pssm-ID: 215247 [Multi-domain]  Cd Length: 459  Bit Score: 189.06  E-value: 2.73e-51
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   14 HFVLVPLMAQGHMIPMIDIATLLARR--GVFVTFVTTpynatrlESF--FARAKQSSLSISLLEIPfpclqvglplgceN 89
Cdd:PLN02448    12 HVVAMPYPGRGHINPMMNLCKLLASRkpDILITFVVT-------EEWlgLIGSDPKPDNIRFATIP-------------N 71
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   90 LdtLPSRsLLRN-----FYKALSL-LQQPLEQFLSRHHLNPTCIISDKYLYWTAQTAHKFKCPRVVFHGTGC--FSLLss 161
Cdd:PLN02448    72 V--IPSE-LVRAadfpgFLEAVMTkMEAPFEQLLDRLEPPVTAIVADTYLFWAVGVGNRRNIPVASLWTMSAtfFSVF-- 146
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  162 HNLQLYSPHT------SIDSNSQPFLVPGLPhkiEITKSQLPGSLIKS--PDFDDFRDKITKAEQEAYgVVVNSFSELEN 233
Cdd:PLN02448   147 YHFDLLPQNGhfpvelSESGEERVDYIPGLS---STRLSDLPPIFHGNsrRVLKRILEAFSWVPKAQY-LLFTSFYELEA 222
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  234 GYYQNYERAISKKLWCIGPV----SLCNENSIEKYNRGnkasieQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLG 309
Cdd:PLN02448   223 QAIDALKSKFPFPVYPIGPSipymELKDNSSSSNNEDN------EPDYFQWLDSQPEGSVLYVSLGSFLSVSSAQMDEIA 296
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  310 QCLESSTRPFIWVIknRDEnCSELEKWLSeeeferktkGRGLIIrGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVP 389
Cdd:PLN02448   297 AGLRDSGVRFLWVA--RGE-ASRLKEICG---------DMGLVV-PWCDQLKVLCHSSVGGFWTHCGWNSTLEAVFAGVP 363
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  390 MITWPQFAEQFLNEKLVVEILKIGVRVgvegavrwGEEERVGVMAKKEEIEKAIEMVMDGGEEGEER-RRRVGDLSKMAP 468
Cdd:PLN02448   364 MLTFPLFWDQPLNSKLIVEDWKIGWRV--------KREVGEETLVGREEIAELVKRFMDLESEEGKEmRRRAKELQEICR 435
                          490
                   ....*....|....*....
gi 1731019548  469 KAMENGGSSYVNLSLFIED 487
Cdd:PLN02448   436 GAIAKGGSSDTNLDAFIRD 454
PLN02410 PLN02410
UDP-glucoronosyl/UDP-glucosyl transferase family protein
16-443 4.36e-47

UDP-glucoronosyl/UDP-glucosyl transferase family protein


Pssm-ID: 178032 [Multi-domain]  Cd Length: 451  Bit Score: 176.77  E-value: 4.36e-47
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   16 VLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPYNATRLESFFARAKQSSLSISLLEIPFpclqvglplgcENLDTLps 95
Cdd:PLN02410    11 VLVPVPAQGHISPMMQLAKTLHLKGFSITIAQTKFNYFSPSDDFTDFQFVTIPESLPESDF-----------KNLGPI-- 77
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   96 rSLLRNFYKALSL-LQQPLEQFLSRHHLNPTCIISDKYLYWTAQTAHKFKCPRVVFHGTGC--------FSLLSSHNLQ- 165
Cdd:PLN02410    78 -EFLHKLNKECQVsFKDCLGQLVLQQGNEIACVVYDEFMYFAEAAAKEFKLPNVIFSTTSAtafvcrsvFDKLYANNVLa 156
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  166 -LYSPhtsidSNSQPFLVPGLpHKIEItkSQLPGSLIKSPD--FDDFRDKITKaeQEAYGVVVNSFSELENGYYQNYERA 242
Cdd:PLN02410   157 pLKEP-----KGQQNELVPEF-HPLRC--KDFPVSHWASLEsiMELYRNTVDK--RTASSVIINTASCLESSSLSRLQQQ 226
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  243 ISKKLWCIGPVSLCNENSIEKYNrgnkasiEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTRPFIWV 322
Cdd:PLN02410   227 LQIPVYPIGPLHLVASAPTSLLE-------ENKSCIEWLNKQKKNSVIFVSLGSLALMEINEVMETASGLDSSNQQFLWV 299
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  323 IKNRDENCSELEKWLSEEeFERKTKGRGLIIRgWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLN 402
Cdd:PLN02410   300 IRPGSVRGSEWIESLPKE-FSKIISGRGYIVK-WAPQKEVLSHPAVGGFWSHCGWNSTLESIGEGVPMICKPFSSDQKVN 377
                          410       420       430       440       450
                   ....*....|....*....|....*....|....*....|....*....|....
gi 1731019548  403 EKLVVEILKIGVRV-------GVEGAVRW------GEEERVGVMAKKEEIEKAI 443
Cdd:PLN02410   378 ARYLECVWKIGIQVegdldrgAVERAVKRlmveeeGEEMRKRAISLKEQLRASV 431
PLN02670 PLN02670
transferase, transferring glycosyl groups
14-416 7.47e-47


Pssm-ID: 178275 [Multi-domain]  Cd Length: 472  Bit Score: 176.63  E-value: 7.47e-47
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   14 HFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPYNATRLESFfarAKQSSLSISLLEIPFPCLQvGLPLGCENLDTL 93
Cdd:PLN02670     8 HVAMFPWLAMGHLIPFLRLSKLLAQKGHKISFISTPRNLHRLPKI---PSQLSSSITLVSFPLPSVP-GLPSSAESSTDV 83
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   94 P--SRSLLRnfyKALSLLQQPLEQFLSRHhlNPTCIISDKYLYWTAQTAHKFkcprvvfhGTGC--FSLLSSHNLQLYSP 169
Cdd:PLN02670    84 PytKQQLLK---KAFDLLEPPLTTFLETS--KPDWIIYDYASHWLPSIAAEL--------GISKafFSLFTAATLSFIGP 150
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  170 HTSI----DSNSQP--FLV--PGLPHKI-------EITKSQLPGSLIKSPDFDDFRDKITKAEQEAygVVVNSFSELENG 234
Cdd:PLN02670   151 PSSLmeggDLRSTAedFTVvpPWVPFESnivfryhEVTKYVEKTEEDETGPSDSVRFGFAIGGSDV--VIIRSSPEFEPE 228
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  235 YYQNYERAISKKLWCIGPVSLCNENSIEKYNRGNKasiEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLES 314
Cdd:PLN02670   229 WFDLLSDLYRKPIIPIGFLPPVIEDDEEDDTIDVK---GWVRIKEWLDKQRVNSVVYVALGTEASLRREEVTELALGLEK 305
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  315 STRPFIWVIKNRDENCSELEKWLsEEEFERKTKGRGLIIRGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWP 394
Cdd:PLN02670   306 SETPFFWVLRNEPGTTQNALEML-PDGFEERVKGRGMIHVGWVPQVKILSHESVGGFLTHCGWNSVVEGLGFGRVLILFP 384
                          410       420
                   ....*....|....*....|..
gi 1731019548  395 QFAEQFLNEKLvVEILKIGVRV 416
Cdd:PLN02670   385 VLNEQGLNTRL-LHGKKLGLEV 405
PLN02210 PLN02210
UDP-glucosyl transferase
12-489 2.49e-46


Pssm-ID: 215127 [Multi-domain]  Cd Length: 456  Bit Score: 174.84  E-value: 2.49e-46
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   12 QPHFVLVPLMAQGHMIPMIDIA--TLLARRGVFVTFVTTpynatrlesffaraKQSSLSISLLEIP-FPCLQVGLPLGCE 88
Cdd:PLN02210     8 ETHVLMVTLAFQGHINPMLKLAkhLSLSSKNLHFTLATT--------------EQARDLLSTVEKPrRPVDLVFFSDGLP 73
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   89 NLDTLPSRSLLRNFYK--ALSLLQQPLEQFLSrhhlnptCIISDKYLYWT--AQTAHKFKCPRVVFHGTGCFSLLSSHNL 164
Cdd:PLN02210    74 KDDPRAPETLLKSLNKvgAKNLSKIIEEKRYS-------CIISSPFTPWVpaVAAAHNIPCAILWIQACGAYSVYYRYYM 146
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  165 QLySPHTSIDSNSQPFLVPGLPhKIEItkSQLPGSLIKS--PDFDDFRDKITKAEQEAYGVVVNSFSELENGYYQNYerA 242
Cdd:PLN02210   147 KT-NSFPDLEDLNQTVELPALP-LLEV--RDLPSFMLPSggAHFNNLMAEFADCLRYVKWVLVNSFYELESEIIESM--A 220
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  243 ISKKLWCIGP-VS---LCNENsiEKYNRGNKASIEQSN--CLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESST 316
Cdd:PLN02210   221 DLKPVIPIGPlVSpflLGDDE--EETLDGKNLDMCKSDdcCMEWLDKQARSSVVYISFGSMLESLENQVETIAKALKNRG 298
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  317 RPFIWVI--KNRDENCSELEKWLSEeeferktkGRGLIIRgWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWP 394
Cdd:PLN02210   299 VPFLWVIrpKEKAQNVQVLQEMVKE--------GQGVVLE-WSPQEKILSHMAISCFVTHCGWNSTIETVVAGVPVVAYP 369
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  395 QFAEQFLNEKLVVEILKIGVRV---GVEGAVrwgeeervgvmaKKEEIEKAIEMVMDgGEEGEERRRRVGDLSKMAPKAM 471
Cdd:PLN02210   370 SWTDQPIDARLLVDVFGIGVRMrndAVDGEL------------KVEEVERCIEAVTE-GPAAADIRRRAAELKHVARLAL 436
                          490
                   ....*....|....*...
gi 1731019548  472 ENGGSSYVNLSLFIEDPP 489
Cdd:PLN02210   437 APGGSSARNLDLFISDIT 454
PLN00164 PLN00164
glucosyltransferase; Provisional
13-501 5.47e-46


Pssm-ID: 215084 [Multi-domain]  Cd Length: 480  Bit Score: 174.09  E-value: 5.47e-46
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   13 PHFVLVPLMAQGHMIPMIDIAT-LLARRG-------VFVTFVTTPYNATRLESFFARAKQSSLSISLLEIPfpclQVGLP 84
Cdd:PLN00164     4 PTVVLLPVWGSGHLMSMLEAGKrLLASSGggalsltVLVMPPPTPESASEVAAHVRREAASGLDIRFHHLP----AVEPP 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   85 LGCENLDTLPSRSLLR---NFYKALSLLQQPLeqflsrhhlnpTCIISDKYLYWTAQTAHKFKCPRVVFHGTGCfSLLSs 161
Cdd:PLN00164    80 TDAAGVEEFISRYIQLhapHVRAAIAGLSCPV-----------AALVVDFFCTPLLDVARELAVPAYVYFTSTA-AMLA- 146
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  162 hnLQLYSP--HTSI----DSNSQPFLVPGLPhkiEITKSQLPGSLI--KSPDFDDFR---DKITkaeqEAYGVVVNSFSE 230
Cdd:PLN00164   147 --LMLRLPalDEEVavefEEMEGAVDVPGLP---PVPASSLPAPVMdkKSPNYAWFVyhgRRFM----EAAGIIVNTAAE 217
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  231 LENGYYQnyerAISK----------KLWCIGPVslcnensiekYNRGNKASIEQSN--CLNWLDSMIPKSVLYICLGSLC 298
Cdd:PLN00164   218 LEPGVLA----AIADgrctpgrpapTVYPIGPV----------ISLAFTPPAEQPPheCVRWLDAQPPASVVFLCFGSMG 283
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  299 RMLPSQLIQLGQCLESSTRPFIWVIKNR---------DENCSELekwLSEEEFERkTKGRGLIIRGWAPQLLILSHWSTG 369
Cdd:PLN00164   284 FFDAPQVREIAAGLERSGHRFLWVLRGPpaagsrhptDADLDEL---LPEGFLER-TKGRGLVWPTWAPQKEILAHAAVG 359
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  370 GFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNEKLVVEILKIGVRVGV-------------EGAVR--WGEEERVGVMA 434
Cdd:PLN00164   360 GFVTHCGWNSVLESLWHGVPMAPWPLYAEQHLNAFELVADMGVAVAMKVdrkrdnfveaaelERAVRslMGGGEEEGRKA 439
                          490       500       510       520       530       540
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1731019548  435 KkeeiEKAIEMvmdggeegeerrrrvgdlsKMA-PKAMENGGSSYVNLSLFIEDppsiLLCSAAQPSR 501
Cdd:PLN00164   440 R----EKAAEM-------------------KAAcRKAVEEGGSSYAALQRLARE----IRHGAVAPTR 480
PLN02173 PLN02173
UDP-glucosyl transferase family protein
10-485 2.25e-42


Pssm-ID: 177830 [Multi-domain]  Cd Length: 449  Bit Score: 162.89  E-value: 2.25e-42
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   10 ELQPHFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPY--NATRLESffarakqsSLSISLLEIPFPCLQVGLplgc 87
Cdd:PLN02173     3 KMRGHVLAVPFPSQGHITPIRQFCKRLHSKGFKTTHTLTTFifNTIHLDP--------SSPISIATISDGYDQGGF---- 70
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   88 ENLDTLPSrsLLRNFYKALSllqQPLEQFLSRHHL--NP-TCIISDKYLYWTAQTAHKFKCPRVVFHGTGC-------FS 157
Cdd:PLN02173    71 SSAGSVPE--YLQNFKTFGS---KTVADIIRKHQStdNPiTCIVYDSFMPWALDLAREFGLAAAPFFTQSCavnyinyLS 145
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  158 LLSSHNLQLysPHTSIdsnsqPFL-VPGLPHKIEITKSQLPGSLIKSPDFDDFrdkitkaeQEAYGVVVNSFSELEngYY 236
Cdd:PLN02173   146 YINNGSLTL--PIKDL-----PLLeLQDLPTFVTPTGSHLAYFEMVLQQFTNF--------DKADFVLVNSFHDLD--LH 208
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  237 QNYERAISKKLWCIGPV--SLCNENSIEK---YNRGNKASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQC 311
Cdd:PLN02173   209 ENELLSKVCPVLTIGPTvpSMYLDQQIKSdndYDLNLFDLKEAALCTDWLDKRPQGSVVYIAFGSMAKLSSEQMEEIASA 288
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  312 LesSTRPFIWVIKNrdencSELEKwLSEEEFERKTKGRGLIIRgWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMI 391
Cdd:PLN02173   289 I--SNFSYLWVVRA-----SEESK-LPPGFLETVDKDKSLVLK-WSPQLQVLSNKAIGCFMTHCGWNSTMEGLSLGVPMV 359
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  392 TWPQFAEQFLNEKLVVEILKIGVRVgvegavrwgEEERVGVMAKKEEIEKAIEMVMDGGEEGEERRRRVGdLSKMAPKAM 471
Cdd:PLN02173   360 AMPQWTDQPMNAKYIQDVWKVGVRV---------KAEKESGIAKREEIEFSIKEVMEGEKSKEMKENAGK-WRDLAVKSL 429
                          490
                   ....*....|....
gi 1731019548  472 ENGGSSYVNLSLFI 485
Cdd:PLN02173   430 SEGGSTDININTFV 443
PLN02167 PLN02167
UDP-glycosyltransferase family protein
16-487 4.35e-42


Pssm-ID: 215112 [Multi-domain]  Cd Length: 475  Bit Score: 162.66  E-value: 4.35e-42
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   16 VLVPLMAQGHMIPMIDIATLLARRGVFVTFVT-----TPYnATRLESFFARAKQSSLSISLLEIPfpclQVGLPLGCENL 90
Cdd:PLN02167     7 IFVPFPSTGHILVTIEFAKRLINLDRRIHTITilywsLPF-APQADAFLKSLIASEPRIRLVTLP----EVQDPPPMELF 81
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   91 DTLPSRSLLRNFYKALSLLQQPLEQFLSRHHLNPTC----IISDKYLYWTAQTAHKFKCPRVVFHgTGCFSLLS-----S 161
Cdd:PLN02167    82 VKASEAYILEFVKKMVPLVRDALSTLVSSRDESDSVrvagLVLDFFCVPLIDVGNEFNLPSYIFL-TCNAGFLGmmkylP 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  162 HNLQLYSPHTSIDSNSQPFLVPGLPHKIEiTKSQLPGSLIKspdfDDFRDKITKAEQ--EAYGVVVNSFSELENGYYQNY 239
Cdd:PLN02167   161 ERHRKTASEFDLSSGEEELPIPGFVNSVP-TKVLPPGLFMK----ESYEAWVEIAERfpEAKGILVNSFTELEPNAFDYF 235
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  240 ERAISK--KLWCIGPVsLCNENSIEKynrgNKASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTR 317
Cdd:PLN02167   236 SRLPENypPVYPVGPI-LSLKDRTSP----NLDSSDRDRIMRWLDDQPESSVVFLCFGSLGSLPAPQIKEIAQALELVGC 310
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  318 PFIWVIK-NRDENCSELEkwLSEEEFERKTKGRGLIIrGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQF 396
Cdd:PLN02167   311 RFLWSIRtNPAEYASPYE--PLPEGFMDRVMGRGLVC-GWAPQVEILAHKAIGGFVSHCGWNSVLESLWFGVPIATWPMY 387
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  397 AEQFLNEKLVVEILKIGVRVGVEGAVRWGEeervgvMAKKEEIEKAIEMVMDggeEGEERRRRVGDLSKMAPKAMENGGS 476
Cdd:PLN02167   388 AEQQLNAFTMVKELGLAVELRLDYVSAYGE------IVKADEIAGAVRSLMD---GEDVPRKKVKEIAEAARKAVMDGGS 458
                          490
                   ....*....|.
gi 1731019548  477 SYVNLSLFIED 487
Cdd:PLN02167   459 SFVAVKRFIDD 469
PLN03004 PLN03004
UDP-glycosyltransferase
16-447 1.47e-41


Pssm-ID: 178581  Cd Length: 451  Bit Score: 160.63  E-value: 1.47e-41
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   16 VLVPLMAQGHMIPMIDIATLLARR----GVFVTFVTTPYNATRLESFFarakqSSLSISLLEIPFPCLQVGLPLGCENLD 91
Cdd:PLN03004     7 VLYPAPPIGHLVSMVELGKTILSKnpslSIHIILVPPPYQPESTATYI-----SSVSSSFPSITFHHLPAVTPYSSSSTS 81
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   92 TLPSRSLLRNFYkALSLLQQPLEQFLSRHHLNPTCIISDKYLYWTAQTAHKFKCPRVVFHGTGCFSLLSShnlqLYSPht 171
Cdd:PLN03004    82 RHHHESLLLEIL-CFSNPSVHRTLFSLSRNFNVRAMIIDFFCTAVLDITADFTFPVYFFYTSGAACLAFS----FYLP-- 154
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  172 SIDSNSQ-------PFL-VPGLPhkiEITKSQLPGSLIKSPD-----FDDFRDKITKAEqeayGVVVNSFSELENGYYQN 238
Cdd:PLN03004   155 TIDETTPgknlkdiPTVhIPGVP---PMKGSDMPKAVLERDDevydvFIMFGKQLSKSS----GIIINTFDALENRAIKA 227
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  239 Y-ERAISKKLWCIGPvsLCNENSIEKYNRGNKASieqsnCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTR 317
Cdd:PLN03004   228 ItEELCFRNIYPIGP--LIVNGRIEDRNDNKAVS-----CLNWLDSQPEKSVVFLCFGSLGLFSKEQVIEIAVGLEKSGQ 300
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  318 PFIWVIKNRDE-NCSELE-KWLSEEEFERKTKGRGLIIRGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQ 395
Cdd:PLN03004   301 RFLWVVRNPPElEKTELDlKSLLPEGFLSRTEDKGMVVKSWAPQVPVLNHKAVGGFVTHCGWNSILEAVCAGVPMVAWPL 380
                          410       420       430       440       450
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1731019548  396 FAEQFLNEKLVVEILKIGVRVgvegavrwgEEERVGVMAKKeEIEKAIEMVM 447
Cdd:PLN03004   381 YAEQRFNRVMIVDEIKIAISM---------NESETGFVSST-EVEKRVQEII 422
RT_RNaseH pfam17917
RNase H-like domain found in reverse transcriptase; DNA polymerase and ribonuclease H (RNase H) activities allow reverse transcriptases to convert the single-stranded retroviral RNA genome into double-stranded DNA, which is integrated into the host chromosome during infection. This entry represents the RNase H like domain.
1685-1775 2.42e-41


Pssm-ID: 465565  Cd Length: 104  Bit Score: 148.04  E-value: 2.42e-41
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1685 FVIYSDASKKGLGCVLMQQG-----KVVAYASRQLKIHEQNYPTHDLELAAVVFALKIWRHYLYGEKIQIYTDHKSLKYF 1759
Cdd:pfam17917    6 FILETDASDYGIGAVLSQKDedgkeRPIAYASRKLTPAERNYSTTEKELLAIVWALKKFRHYLLGRKFTVYTDHKPLKYL 85
                           90
                   ....*....|....*.
gi 1731019548 1760 FTQKELNMRQRRWLEL 1775
Cdd:pfam17917   86 FTPKELNGRLARWALF 101
PLN02555 PLN02555
limonoid glucosyltransferase
13-487 5.76e-41


Pssm-ID: 178170 [Multi-domain]  Cd Length: 480  Bit Score: 159.58  E-value: 5.76e-41
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   13 PHFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTpynatrlESFFARAKQSSLSISLLEIP-------FPCLQVGLPl 85
Cdd:PLN02555     8 VHVMLVSFPGQGHVNPLLRLGKLLASKGLLVTFVTT-------ESWGKKMRQANKIQDGVLKPvgdgfirFEFFEDGWA- 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   86 gcenlDTLPSRSLLRNFYKALSLL-QQPLEQFLSRH--HLNP-TCIISDKYLYWTAQTAHKFKCPRVVF--HGTGCFSLL 159
Cdd:PLN02555    80 -----EDDPRRQDLDLYLPQLELVgKREIPNLVKRYaeQGRPvSCLINNPFIPWVCDVAEELGIPSAVLwvQSCACFSAY 154
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  160 S--SHNLQLYSPHTSIDSNSQPFLVPGLPHkieitkSQLPGSLIKSPDFDDFRDKIT---KAEQEAYGVVVNSFSELEng 234
Cdd:PLN02555   155 YhyYHGLVPFPTETEPEIDVQLPCMPLLKY------DEIPSFLHPSSPYPFLRRAILgqyKNLDKPFCILIDTFQELE-- 226
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  235 yyQNYERAISKkLWCIGPVS--LCNENSIEKYNRGNkASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCL 312
Cdd:PLN02555   227 --KEIIDYMSK-LCPIKPVGplFKMAKTPNSDVKGD-ISKPADDCIEWLDSKPPSSVVYISFGTVVYLKQEQIDEIAYGV 302
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  313 ESSTRPFIWVIKNRDENcSELEKWLSEEEFERKTKGRGLIIRgWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMIT 392
Cdd:PLN02555   303 LNSGVSFLWVMRPPHKD-SGVEPHVLPEEFLEKAGDKGKIVQ-WCPQEKVLAHPSVACFVTHCGWNSTMEALSSGVPVVC 380
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  393 WPQFAEQFLNEKLVVEILKIGVRVGvegavRWGEEERVgvmAKKEEI----------EKAIEMVMDGGEegeerrrrvgd 462
Cdd:PLN02555   381 FPQWGDQVTDAVYLVDVFKTGVRLC-----RGEAENKL---ITREEVaeclleatvgEKAAELKQNALK----------- 441
                          490       500
                   ....*....|....*....|....*
gi 1731019548  463 LSKMAPKAMENGGSSYVNLSLFIED 487
Cdd:PLN02555   442 WKEEAEAAVAEGGSSDRNFQEFVDK 466
PLN02992 PLN02992
coniferyl-alcohol glucosyltransferase
12-448 5.49e-40


Pssm-ID: 178572 [Multi-domain]  Cd Length: 481  Bit Score: 156.68  E-value: 5.49e-40
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   12 QPHFVLVPLMAQGHMIPMIDIAT-LLARRGVFVTFVTTPYNATRLESFFArakqSSLSISLLEIPFPCL--------QVG 82
Cdd:PLN02992     5 KPHAAMFSSPGMGHVIPVIELGKrLSANHGFHVTVFVLETDAASAQSKFL----NSTGVDIVGLPSPDIsglvdpsaHVV 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   83 LPLGCENLDTLPSrslLRNFYKALsllqqpleqflsrhHLNPTCIISDKYLYWTAQTAHKFKCPRVVFHGTGCFSLlssh 162
Cdd:PLN02992    81 TKIGVIMREAVPT---LRSKIAEM--------------HQKPTALIVDLFGTDALCLGGEFNMLTYIFIASNARFL---- 139
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  163 NLQLYSPHTSIDSNS------QPFLVPGL-PHKIEITksqLPGSLIksPDFDDFRDKITK--AEQEAYGVVVNSFSELEN 233
Cdd:PLN02992   140 GVSIYYPTLDKDIKEehtvqrKPLAMPGCePVRFEDT---LDAYLV--PDEPVYRDFVRHglAYPKADGILVNTWEEMEP 214
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  234 GYYQNYE------RAISKKLWCIGPvsLCnensiekynRGNKASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQ 307
Cdd:PLN02992   215 KSLKSLQdpkllgRVARVPVYPIGP--LC---------RPIQSSKTDHPVLDWLNKQPNESVLYISFGSGGSLSAKQLTE 283
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  308 LGQCLESSTRPFIWVIKN--RDENCSElekWLSE--------------EEFERKTKGRGLIIRGWAPQLLILSHWSTGGF 371
Cdd:PLN02992   284 LAWGLEMSQQRFVWVVRPpvDGSACSA---YFSAnggetrdntpeylpEGFVSRTHDRGFVVPSWAPQAEILAHQAVGGF 360
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  372 LTHCGWNSTVEGIGNGVPMITWPQFAEQFLNEKLVVEILKIGVRV-GVEGAV-RWGEEERV-GVMAKKE--EIEKAIEMV 446
Cdd:PLN02992   361 LTHCGWSSTLESVVGGVPMIAWPLFAEQNMNAALLSDELGIAVRSdDPKEVIsRSKIEALVrKVMVEEEgeEMRRKVKKL 440

                   ..
gi 1731019548  447 MD 448
Cdd:PLN02992   441 RD 442
PLN02152 PLN02152
indole-3-acetate beta-glucosyltransferase
12-487 8.38e-40


Pssm-ID: 177813 [Multi-domain]  Cd Length: 455  Bit Score: 155.21  E-value: 8.38e-40
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   12 QPHFVLVPLMAQGHMIPMIDIAT-LLARRGVFVTFVTTPynatrleSFFARAKQSSLSiSLLEIPFPCLQVGLPLG-CEN 89
Cdd:PLN02152     3 PPHFLLVTFPAQGHVNPSLRFARrLIKTTGTRVTFATCL-------SVIHRSMIPNHN-NVENLSFLTFSDGFDDGvISN 74
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   90 LDTLPSRS--LLRNFYKALSllqqpleQFL--SRHHLNP-TCIISDKYLYWTAQTAHKFKCPRVVFHGTGCFSLlsshnl 164
Cdd:PLN02152    75 TDDVQNRLvnFERNGDKALS-------DFIeaNLNGDSPvTCLIYTILPNWAPKVARRFHLPSVLLWIQPAFVF------ 141
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  165 QLYSPHTSidSNSQPFLVPGLPH-KIEITKSQLPGSLIKSPDFDDFRDKITKAEQEAY-GVVVNSFSELENGYYQNYERA 242
Cdd:PLN02152   142 DIYYNYST--GNNSVFEFPNLPSlEIRDLPSFLSPSNTNKAAQAVYQELMEFLKEESNpKILVNTFDSLEPEFLTAIPNI 219
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  243 iskKLWCIGPVsLCNENSIEKYNRGNKASIEQSNCLN-WLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTRPFIW 321
Cdd:PLN02152   220 ---EMVAVGPL-LPAEIFTGSESGKDLSVRDQSSSYTlWLDSKTESSVIYVSFGTMVELSKKQIEELARALIEGKRPFLW 295
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  322 VIKNRDENCSELEkwlSEEE--------FERKTKGRGLIIrGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITW 393
Cdd:PLN02152   296 VITDKLNREAKIE---GEEEteiekiagFRHELEEVGMIV-SWCSQIEVLRHRAVGCFVTHCGWSSSLESLVLGVPVVAF 371
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  394 PQFAEQFLNEKLVVEILKIGVRVgvegavrwgEEERVGvMAKKEEIEKAIEMVMDGGEEGEERRRRVgdLSKMAPKAMEN 473
Cdd:PLN02152   372 PMWSDQPANAKLLEEIWKTGVRV---------RENSEG-LVERGEIRRCLEAVMEEKSVELRESAEK--WKRLAIEAGGE 439
                          490
                   ....*....|....
gi 1731019548  474 GGSSYVNLSLFIED 487
Cdd:PLN02152   440 GGSSDKNVEAFVKT 453
PLN02207 PLN02207
UDP-glycosyltransferase
15-487 3.76e-39


Pssm-ID: 177857 [Multi-domain]  Cd Length: 468  Bit Score: 153.65  E-value: 3.76e-39
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   15 FVLVPLMAQGHMIPMIDIATLLARRG--VFVTFVTTPYNA-TRLESFFARAKQSSLSISLLEIPFpcLQVGLPLGcenlD 91
Cdd:PLN02207     6 LIFIPTPTVGHLVPFLEFARRLIEQDdrIRITILLMKLQGqSHLDTYVKSIASSQPFVRFIDVPE--LEEKPTLG----G 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   92 TLPSRSLLRNFY-KALSLLQQPLEQFLSRHHLNPTCI---ISDKYLYWTAQTAHKFKCPRVVFHGTGC-FSLLSSHNLQL 166
Cdd:PLN02207    80 TQSVEAYVYDVIeKNIPLVRNIVMDILSSLALDGVKVkgfVADFFCLPMIDVAKDVSLPFYVFLTTNSgFLAMMQYLADR 159
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  167 YSPHTSI-DSNSQPFL-VPGLPHKIeiTKSQLPGSLIKSPDFDDFRdKITKAEQEAYGVVVNSFSELE----NGYY--QN 238
Cdd:PLN02207   160 HSKDTSVfVRNSEEMLsIPGFVNPV--PANVLPSALFVEDGYDAYV-KLAILFTKANGILVNSSFDIEpysvNHFLdeQN 236
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  239 YeraisKKLWCIGPVslcnensiekYNRGNKASIEQS-----NCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLE 313
Cdd:PLN02207   237 Y-----PSVYAVGPI----------FDLKAQPHPEQDlarrdELMKWLDDQPEASVVFLCFGSMGRLRGPLVKEIAHGLE 301
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  314 SSTRPFIWVIKNRDENCSELekwlSEEEFERKTKGRGLIIrGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITW 393
Cdd:PLN02207   302 LCQYRFLWSLRTEEVTNDDL----LPEGFLDRVSGRGMIC-GWSPQVEILAHKAVGGFVSHCGWNSIVESLWFGVPIVTW 376
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  394 PQFAEQFLNEKLVVEILKIGVRVGVEGAVRWGEeervgvMAKKEEIEKAIEMVMDggEEGEERRRRVGDLSKMAPKAMEN 473
Cdd:PLN02207   377 PMYAEQQLNAFLMVKELKLAVELKLDYRVHSDE------IVNANEIETAIRCVMN--KDNNVVRKRVMDISQMIQRATKN 448
                          490
                   ....*....|....
gi 1731019548  474 GGSSYVNLSLFIED 487
Cdd:PLN02207   449 GGSSFAAIEKFIHD 462
RT_RNaseH_2 pfam17919
RNase H-like domain found in reverse transcriptase;
1655-1749 3.93e-38


Pssm-ID: 465567 [Multi-domain]  Cd Length: 100  Bit Score: 138.40  E-value: 3.93e-38
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1655 WSPACERSFQELKQKLVTAPVLTVPDGSGNFVIYSDASKKGLGCVLMQQG-----KVVAYASRQLKIHEQNYPTHDLELA 1729
Cdd:pfam17919    1 WTEECQKAFEKLKQALTSAPVLAHPDPDKPFILETDASDYGIGAVLSQEDddggeRPIAYASRKLSPAERNYSTTEKELL 80
                           90       100
                   ....*....|....*....|
gi 1731019548 1730 AVVFALKIWRHYLYGEKIQI 1749
Cdd:pfam17919   81 AIVFALKKFRHYLLGRKFTV 100
RVT_1 pfam00078
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses.
1432-1592 6.27e-38


Pssm-ID: 395031 [Multi-domain]  Cd Length: 189  Bit Score: 141.67  E-value: 6.27e-38
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1432 VKKKD-GSMRLC----IDYRELNKVTVK-------NRYPLPRIDDLFDQLQGATVFSKIDLRSGYHQLRIRDGDIPKTAF 1499
Cdd:pfam00078    1 IPKKGkGKYRPIsllsIDYKALNKIIVKrlkpenlDSPPQPGFRPGLAKLKKAKWFLKLDLKKAFDQVPLDELDRKLTAF 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1500 R-----------SRYGHYEFVVMSFGLTNAPAVFMDLMNRVFKEFL---DSFVIVFIDDILIYSKTEAEHEEHLHQVLET 1565
Cdd:pfam00078   81 TtppininwngeLSGGRYEWKGLPQGLVLSPALFQLFMNELLRPLRkraGLTLVRYADDILIFSKSEEEHQEALEEVLEW 160
                          170       180
                   ....*....|....*....|....*....
gi 1731019548 1566 LRANKLYAKFSKCEFWL--RKVTFLGHVV 1592
Cdd:pfam00078  161 LKESGLKINPEKTQFFLksKEVKYLGVTL 189
PLN02554 PLN02554
UDP-glycosyltransferase family protein
177-487 7.60e-38


Pssm-ID: 215304  Cd Length: 481  Bit Score: 150.32  E-value: 7.60e-38
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  177 SQPFLVPGLPHkIEITKSQLPGSLIKSPDFddfrdkitkaeQEAYGVVVNSFSELE-------NGYYQNYERAiskklWC 249
Cdd:PLN02554   180 TRPYPVKCLPS-VLLSKEWLPLFLAQARRF-----------REMKGILVNTVAELEpqalkffSGSSGDLPPV-----YP 242
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  250 IGPVsLCNENSIEKYnrgnkASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTRPFIW-------- 321
Cdd:PLN02554   243 VGPV-LHLENSGDDS-----KDEKQSEILRWLDEQPPKSVVFLCFGSMGGFSEEQAREIAIALERSGHRFLWslrraspn 316
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  322 VIKNRDENCSELEKWLSEEEFERkTKGRGLIIrGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFL 401
Cdd:PLN02554   317 IMKEPPGEFTNLEEILPEGFLDR-TKDIGKVI-GWAPQVAVLAKPAIGGFVTHCGWNSILESLWFGVPMAAWPLYAEQKF 394
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  402 NEKLVVEILKIGVRVGVE--GAVRWGEEERVGVmakkEEIEKAIEMVMDggeEGEERRRRVGDLSKMAPKAMENGGSSYV 479
Cdd:PLN02554   395 NAFEMVEELGLAVEIRKYwrGDLLAGEMETVTA----EEIERGIRCLME---QDSDVRKRVKEMSEKCHVALMDGGSSHT 467

                   ....*...
gi 1731019548  480 NLSLFIED 487
Cdd:PLN02554   468 ALKKFIQD 475
PLN03015 PLN03015
UDP-glucosyl transferase
12-481 3.13e-35

UDP-glucosyl transferase


Pssm-ID: 178589 [Multi-domain]  Cd Length: 470  Bit Score: 142.14  E-value: 3.13e-35
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   12 QPHFVLVPLMAQGHMIPMIDIATLLAR-RGVFVTFVT-TPYNATRLESFFARAKQSSLSISLLEIPfpclqvglPLGCEN 89
Cdd:PLN03015     3 QPHALLVASPGLGHLIPILELGNRLSSvLNIHVTILAvTSGSSSPTETEAIHAAAARTTCQITEIP--------SVDVDN 74
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   90 LdTLPSRSLLRNFYKALSLLQQPLEQFLSRHHLNPTCIISDkylywtaqtahkfkcprvvFHGTGCFSL----------- 158
Cdd:PLN03015    75 L-VEPDATIFTKMVVKMRAMKPAVRDAVKSMKRKPTVMIVD-------------------FFGTALMSIaddvgvtakyv 134
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  159 -LSSHN----LQLYSP--HTSIDSN----SQPFLVPGLPhkiEITKSQLPGSLIKSPDfDDFRDKITKAEQ--EAYGVVV 225
Cdd:PLN03015   135 yIPSHAwflaVMVYLPvlDTVVEGEyvdiKEPLKIPGCK---PVGPKELMETMLDRSD-QQYKECVRSGLEvpMSDGVLV 210
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  226 NSFSELENGYY------QNYERAISKKLWCIGPVSLCNENsIEKYNrgnkaSIeqsncLNWLDSMIPKSVLYICLGSLCR 299
Cdd:PLN03015   211 NTWEELQGNTLaalredMELNRVMKVPVYPIGPIVRTNVH-VEKRN-----SI-----FEWLDKQGERSVVYVCLGSGGT 279
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  300 MLPSQLIQLGQCLESSTRPFIWVI-----------KNRDENCSELEkwlseEEFERKTKGRGLIIRGWAPQLLILSHWST 368
Cdd:PLN03015   280 LTFEQTVELAWGLELSGQRFVWVLrrpasylgassSDDDQVSASLP-----EGFLDRTRGVGLVVTQWAPQVEILSHRSI 354
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  369 GGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNEKLVVEilKIGVrvgvegAVRWGEEERVGVMAKKEEIEKAIEMVMD 448
Cdd:PLN03015   355 GGFLSHCGWSSVLESLTKGVPIVAWPLYAEQWMNATLLTE--EIGV------AVRTSELPSEKVIGREEVASLVRKIVAE 426
                          490       500       510
                   ....*....|....*....|....*....|...
gi 1731019548  449 GGEEGEERRRRVGDLSKMAPKAMENGGSSYVNL 481
Cdd:PLN03015   427 EDEEGQKIRAKAEEVRVSSERAWSHGGSSYNSL 459
RVP_2 pfam08284
Retroviral aspartyl protease; Single domain aspartyl proteases from retroviruses, ...
1173-1304 3.97e-32

Retroviral aspartyl protease; Single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases.


Pssm-ID: 400537  Cd Length: 134  Bit Score: 122.92  E-value: 3.97e-32
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1173 QGRVFATTRQEAERAGTVVTGTLPILGHYAFVLFDSGSSHSFISSVFVQHVGLEVEPLGSVLSVSTPSGEVLLSKEQIKA 1252
Cdd:pfam08284    2 QGRVNHLSAEEAEASPDVIQGTFLVNSIPATVLFDSGATHSFISHAFVGKLKLPVESLSNPLCIETPTGGSVTTNLICPS 81
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1731019548 1253 CRVEIANRMLDVTLLVLDMQDFDVILGMDWLSANHANIDCYGKEVVFNPPSE 1304
Cdd:pfam08284   82 CPIEIQGISFLADLILLDMKDLDVILGMDWLSKNKANIDCARRTVTLTKERE 133
PLN00414 PLN00414
glycosyltransferase family protein
14-448 9.31e-32

glycosyltransferase family protein


Pssm-ID: 177807 [Multi-domain]  Cd Length: 446  Bit Score: 131.30  E-value: 9.31e-32
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   14 HFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPYNATRLE--SFFARakqsslSISLLEIPFPCLQvGLPLGCENLD 91
Cdd:PLN00414     6 HAFMYPWFGFGHMIPYLHLANKLAEKGHRVTFFLPKKAHKQLQplNLFPD------SIVFEPLTLPPVD-GLPFGAETAS 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   92 TLPSrSLLRNFYKALSLLQQPLEQFLsrHHLNPTCIISDkYLYWTAQTAHKFKCPRVVFH--GTGCFSLLSSHNLQLYSP 169
Cdd:PLN00414    79 DLPN-STKKPIFDAMDLLRDQIEAKV--RALKPDLIFFD-FVHWVPEMAKEFGIKSVNYQiiSAACVAMVLAPRAELGFP 154
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  170 htsidsnsqPflvPGLPhkieITKSQLPG------SLIKSPDfdDFRDKITKAEQEAYGVVVNSFSELENGYYQNYERAI 243
Cdd:PLN00414   155 ---------P---PDYP----LSKVALRGhdanvcSLFANSH--ELFGLITKGLKNCDVVSIRTCVELEGNLCDFIERQC 216
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  244 SKKLWCIGPVSLcnensiEKYNRGNKASIEQSNclNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTRPF-IWV 322
Cdd:PLN00414   217 QRKVLLTGPMLP------EPQNKSGKPLEDRWN--HWLNGFEPGSVVFCAFGTQFFFEKDQFQEFCLGMELTGLPFlIAV 288
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  323 IKNRdeNCSELEKWLSEEeFERKTKGRGLIIRGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLN 402
Cdd:PLN00414   289 MPPK--GSSTVQEALPEG-FEERVKGRGIVWEGWVEQPLILSHPSVGCFVNHCGFGSMWESLVSDCQIVFIPQLADQVLI 365
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|....*.
gi 1731019548  403 EKLVVEILKIGVRVgvegavrwgEEERVGVMAkKEEIEKAIEMVMD 448
Cdd:PLN00414   366 TRLLTEELEVSVKV---------QREDSGWFS-KESLRDTVKSVMD 401
PLN02562 PLN02562
UDP-glycosyltransferase
8-487 4.77e-29

UDP-glycosyltransferase


Pssm-ID: 215305  Cd Length: 448  Bit Score: 123.07  E-value: 4.77e-29
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548    8 QLELQPHFVLVPLMAQGHMIPMIDIATLLARRGvFVTFVTTPYNATRLESFFARAKqssLSISLLEIPfpclqvglplgc 87
Cdd:PLN02562     2 KVTQRPKIILVPYPAQGHVTPMLKLASAFLSRG-FEPVVITPEFIHRRISATLDPK---LGITFMSIS------------ 65
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   88 ENLDTLPSRSLLRNFYKALSLLQQPLEQFLSR--HHLNPTCIISDKYLYWTAQTAHKFKCPRVVFHGTgcfsLLSSHNLQ 165
Cdd:PLN02562    66 DGQDDDPPRDFFSIENSMENTMPPQLERLLHKldEDGEVACMVVDLLASWAIGVADRCGVPVAGFWPV----MLAAYRLI 141
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  166 LYSPH---TSIDSNS----QPFLVPGLPHKIEITKSQLPGsLIKSPD-----FDDFRDKITKAEQEAYgVVVNSFSELE- 232
Cdd:PLN02562   142 QAIPElvrTGLISETgcprQLEKICVLPEQPLLSTEDLPW-LIGTPKarkarFKFWTRTLERTKSLRW-ILMNSFKDEEy 219
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  233 ---NGYYQNYERAISKKLWCIGPVSLCNENSIEKYNRGNkasiEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQ-L 308
Cdd:PLN02562   220 ddvKNHQASYNNGQNPQILQIGPLHNQEATTITKPSFWE----EDMSCLGWLQEQKPNSVIYISFGSWVSPIGESNVRtL 295
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  309 GQCLESSTRPFIWVIKnrdencSELEKWLSEEEFERKTKgRGLIIrGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGV 388
Cdd:PLN02562   296 ALALEASGRPFIWVLN------PVWREGLPPGYVERVSK-QGKVV-SWAPQLEVLKHQAVGCYLTHCGWNSTMEAIQCQK 367
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  389 PMITWPQFAEQFLNEKLVVEILKIGVRVGVEGavrwgeeervgvmakKEEIEKAIEMVMDGGEEGEErrrrvgdLSKMAP 468
Cdd:PLN02562   368 RLLCYPVAGDQFVNCAYIVDVWKIGVRISGFG---------------QKEVEEGLRKVMEDSGMGER-------LMKLRE 425
                          490       500
                   ....*....|....*....|.
gi 1731019548  469 KAM--ENGGSSYVNLSLFIED 487
Cdd:PLN02562   426 RAMgeEARLRSMMNFTTLKDE 446
PLN02764 PLN02764
glycosyltransferase family protein
11-439 3.59e-24

glycosyltransferase family protein


Pssm-ID: 178364  Cd Length: 453  Bit Score: 108.61  E-value: 3.59e-24
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   11 LQPHFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPYNATRLE--SFFARakqsslSISLLEIPFPCLQvGLPLGCE 88
Cdd:PLN02764     4 LKFHVLMYPWFATGHMTPFLFLANKLAEKGHTVTFLLPKKALKQLEhlNLFPH------NIVFRSVTVPHVD-GLPVGTE 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   89 NLDTLP--SRSLLRNfykALSLLQQPLEQFLsrHHLNPTCIISDkYLYWTAQTAHKF---KCPRVVFHGTGCFSLLSSHN 163
Cdd:PLN02764    77 TVSEIPvtSADLLMS---AMDLTRDQVEVVV--RAVEPDLIFFD-FAHWIPEVARDFglkTVKYVVVSASTIASMLVPGG 150
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  164 LQLYSPhtsidsnsqpflvPGLPHKIEITKSQLPGSL--IKSPDFDD----FRDKITKAEQEAYGVVVNSFSELENGYYQ 237
Cdd:PLN02764   151 ELGVPP-------------PGYPSSKVLLRKQDAYTMknLEPTNTIDvgpnLLERVTTSLMNSDVIAIRTAREIEGNFCD 217
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  238 NYERAISKKLWCIGPVslcnensiekYNRGNKASIEQSNCLNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTR 317
Cdd:PLN02764   218 YIEKHCRKKVLLTGPV----------FPEPDKTRELEERWVKWLSGYEPDSVVFCALGSQVILEKDQFQELCLGMELTGS 287
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  318 PFIWVIKNrDENCSELEKWLSEEeFERKTKGRGLIIRGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFA 397
Cdd:PLN02764   288 PFLVAVKP-PRGSSTIQEALPEG-FEERVKGRGVVWGGWVQQPLILSHPSVGCFVSHCGFGSMWESLLSDCQIVLVPQLG 365
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|....*..
gi 1731019548  398 EQFLNEKLVVEILKIGVRVGVEgAVRWGEEERV-----GVMAKKEEI 439
Cdd:PLN02764   366 DQVLNTRLLSDELKVSVEVARE-ETGWFSKESLrdainSVMKRDSEI 411
RT_ZFREV_like cd03715
RT_ZFREV_like: A subfamily of reverse transcriptases (RTs) found in sequences similar to the ...
1389-1591 6.36e-23

RT_ZFREV_like: A subfamily of reverse transcriptases (RTs) found in sequences similar to the intact endogenous retrovirus ZFERV from zebrafish and to Moloney murine leukemia virus RT. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contains fungal mitochondrial introns and transposable elements that lack LTRs. Phylogenetic analysis suggests that ZFERV belongs to a distinct group of retroviruses.


Pssm-ID: 239685 [Multi-domain]  Cd Length: 210  Bit Score: 98.96  E-value: 6.36e-23
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1389 PISRAPYRMAPAELKELKVQLQELLDKGFIRPSVSPWGAPVLFVKKKDG-SMRLCIDYRELNKVTVKNRYPLPRIDDLFD 1467
Cdd:cd03715      1 PVNQKQYPLPREAREGITPHIQELLEAGILVPCQSPWNTPILPVKKPGGnDYRMVQDLRLVNQAVLPIHPAVPNPYTLLS 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1468 QLQGA-TVFSKIDLRSGYHQLRIRDGDIPKTAFRSRYGHYEFVVMSFGLTNAPAVFMDLMNRVFKEF----LDSFVIVFI 1542
Cdd:cd03715     81 LLPPKhQWYTVLDLANAFFSLPLAPDSQPLFAFEWEGQQYTFTRLPQGFKNSPTLFHEALARDLAPFplehEGTILLQYV 160
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*....
gi 1731019548 1543 DDILIYSKTEAEHEEHLHQVLETLRANKLYAKFSKCEFWLRKVTFLGHV 1591
Cdd:cd03715    161 DDLLLAADSEEDCLKGTDALLTHLGELGYKVSPKKAQICRAEVKFLGVV 209
PLN02208 PLN02208
glycosyltransferase family protein
9-448 3.97e-22

glycosyltransferase family protein


Pssm-ID: 177858  Cd Length: 442  Bit Score: 102.02  E-value: 3.97e-22
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548    9 LELQPHFVLVPLMAQGHMIPMIDIATLLARRGVFVTFVTTPYNATRLE--SFFArakqSSLSISLLEIPfPClqVGLPLG 86
Cdd:PLN02208     1 MEPKFHAFMFPWFAFGHMIPFLHLANKLAEKGHRVTFLLPKKAQKQLEhhNLFP----DSIVFHPLTIP-PV--NGLPAG 73
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548   87 CENLDTLPSrSLLRNFYKALSLLQQPLEQflSRHHLNPTCIISDkYLYWTAQTAHKFKCPRVvfhgtgCFSLLSSHNLQl 166
Cdd:PLN02208    74 AETTSDIPI-SMDNLLSEALDLTRDQVEA--AVRALRPDLIFFD-FAQWIPEMAKEHMIKSV------SYIIVSATTIA- 142
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  167 yspHTSIDSNSQPFLVPGLPH-KIEITKSQLPGSLIKSPDFDDFRDKITKAEQEAYGVVVNSFSELENGYYQNYERAISK 245
Cdd:PLN02208   143 ---HTHVPGGKLGVPPPGYPSsKVLFRENDAHALATLSIFYKRLYHQITTGLKSCDVIALRTCKEIEGKFCDYISRQYHK 219
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  246 KLWCIGPVSLCNENSiekynrgnKASIEQSNclNWLDSMIPKSVLYICLGSLCRMLPSQLIQLGQCLESSTRPFIWVIKN 325
Cdd:PLN02208   220 KVLLTGPMFPEPDTS--------KPLEEQWS--HFLSGFPPKSVVFCSLGSQIILEKDQFQELCLGMELTGLPFLIAVKP 289
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  326 rDENCSELEKWLSEEeFERKTKGRGLIIRGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNEKL 405
Cdd:PLN02208   290 -PRGSSTVQEGLPEG-FEERVKGRGVVWGGWVQQPLILDHPSIGCFVNHCGPGTIWESLVSDCQMVLIPFLSDQVLFTRL 367
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|...
gi 1731019548  406 VVEILKIGVRVgvegavrwgEEERVGVMAkKEEIEKAIEMVMD 448
Cdd:PLN02208   368 MTEEFEVSVEV---------SREKTGWFS-KESLSNAIKSVMD 400
Retrotrans_gag pfam03732
Retrotransposon gag protein; Gag or Capsid-like proteins from LTR retrotransposons. There is a ...
943-1038 4.67e-20

Retrotransposon gag protein; Gag or Capsid-like proteins from LTR retrotransposons. There is a central motif QGXXEXXXXXFXXLXXH that is common to Retroviridae gag-proteins, but is poorly conserved.


Pssm-ID: 367628  Cd Length: 97  Bit Score: 87.00  E-value: 4.67e-20
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  943 AVFFLEDRGTAWWETAERmlGGDVSKITWEQFKENFYAKFFSANVKHAKLQEFLNLEQGDMTVEQYDAEFDMLSRFAPDM 1022
Cdd:pfam03732    3 AVHSLRGAALTWWKSLVA--RSIDAFDSWDELKDAFLKRFFPSIRKDLLRNELRSLRQGTESVREYVERFKRLARQLPHH 80
                           90
                   ....*....|....*.
gi 1731019548 1023 VRDEAARTEKFVRGLR 1038
Cdd:pfam03732   81 GRDEEALISAFLRGLR 96
RT_Rtv cd01645
RT_Rtv: Reverse transcriptases (RTs) from retroviruses (Rtvs). RTs catalyze the conversion of ...
1402-1566 2.28e-19

RT_Rtv: Reverse transcriptases (RTs) from retroviruses (Rtvs). RTs catalyze the conversion of single-stranded RNA into double-stranded viral DNA for integration into host chromosomes. Proteins in this subfamily contain long terminal repeats (LTRs) and are multifunctional enzymes with RNA-directed DNA polymerase, DNA-directed DNA polymerase, and ribonuclease hybrid (RNase H) activities. The viral RNA genome enters the cytoplasm as part of a nucleoprotein complex, and reverse transcription takes place in the cytoplasm, generating a linear DNA duplex via an intricate series of steps. This duplex DNA is colinear with its RNA template, but contains terminal duplications known as LTRs that are not present in viral RNA. It has been proposed that two specialized template switches, known as strand-transfer reactions or "jumps", are required to generate the LTRs.


Pssm-ID: 238823 [Multi-domain]  Cd Length: 213  Bit Score: 88.88  E-value: 2.28e-19
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1402 LKELKVQLQELLDKGFIRPSVSPWGAPVLFVKKKDGSMRLCIDYRELNKVTVKnryplpriddlFDQLQ-GATVFSKI-- 1478
Cdd:cd01645     14 LEALTELVTEQLKEGHIEPSTSPWNTPVFVIKKKSGKWRLLHDLRAVNAQTQD-----------MGALQpGLPHPAALpk 82
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1479 -------DLRSGYHQLRIRDGDIPKTAF-------RSRYGHYEFVVMSFGLTNAPAVFMDLMNRVFKEF----LDSFVIV 1540
Cdd:cd01645     83 gwplivlDLKDCFFSIPLHPDDRERFAFtvpsinnKGPAKRYQWKVLPQGMKNSPTICQSFVAQALEPFrkqyPDIVIYH 162
                          170       180
                   ....*....|....*....|....*.
gi 1731019548 1541 FIDDILIYSKTEAEHEEHLHQVLETL 1566
Cdd:cd01645    163 YMDDILIASDLEGQLREIYEELRQTL 188
Integrase_H2C2 pfam17921
Integrase zinc binding domain; This zinc binding domain is found in a wide variety of ...
1891-1949 5.79e-16

Integrase zinc binding domain; This zinc binding domain is found in a wide variety of integrase proteins.


Pssm-ID: 465569 [Multi-domain]  Cd Length: 58  Bit Score: 73.82  E-value: 5.79e-16
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1731019548 1891 DSAVKTELLTEAHSSpfTMHPGSTKMYQDLRSVYWWRGMKRDVADFVSRCLVCQQVKAP 1949
Cdd:pfam17921    2 PKSLRKEILKEAHDS--GGHLGIEKTLARLRRRYWWPGMRKDVKKYVKSCETCQRRKPS 58
UDPGT pfam00201
UDP-glucuronosyl and UDP-glucosyl transferase;
289-448 7.84e-12

UDP-glucuronosyl and UDP-glucosyl transferase;


Pssm-ID: 278624 [Multi-domain]  Cd Length: 499  Bit Score: 70.13  E-value: 7.84e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  289 VLYICLGSLCRMLPSQ-LIQLGQCLESSTRPFIWviknrdencselekwlSEEEFERKTKGRGLIIRGWAPQLLILSHWS 367
Cdd:pfam00201  277 VVVFSLGSMVSNIPEEkANAIASALAQIPQKVLW----------------RFDGTKPSTLGNNTRLVKWLPQNDLLGHPK 340
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  368 TGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNEKLVVEilkigvrVGVEGAVRWGEeervgvMAkKEEIEKAIEMVM 447
Cdd:pfam00201  341 TRAFITHAGSNGVYEAICHGVPMVGMPLFGDQMDNAKHMEA-------KGAAVTLNVLT------MT-SEDLLNALKEVI 406

                   .
gi 1731019548  448 D 448
Cdd:pfam00201  407 N 407
RNase_HI_RT_DIRS1 cd09275
DIRS1 family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes ...
1686-1800 9.83e-11

DIRS1 family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Phylogenetically, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The structural features of DIRS1-group elements are different from typical LTR elements. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260007  Cd Length: 120  Bit Score: 61.15  E-value: 9.83e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1686 VIYSDASKKGLGCVLMQQGKVVAYASrqlkiHEQNYPTHDLELAAVVFALKIWRHYLYGEKIQIYTDHKS----LKYFFT 1761
Cdd:cd09275      1 VLFTDASLSGWGAYLLNSRAHGPWSA-----DERNKHINLLELKAVLLALQHFAAELKNRKILIRTDNTTavayINKQGG 75
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|...
gi 1731019548 1762 QK--ELNMRQRRWLELVKDYDCEI--LYHPGKANVVADALSRK 1800
Cdd:cd09275     76 TSspPLLALARQILLWCEQRNIWLraSHIPGVLNTEADRLSRL 118
Ty3_capsid pfam19259
Ty3 transposon capsid-like protein; This entry corresponds to the capsid protein found in the ...
905-1062 6.99e-10

Ty3 transposon capsid-like protein; This entry corresponds to the capsid protein found in the Ty3 transposons of yeast as well as other transposable elements.


Pssm-ID: 437091 [Multi-domain]  Cd Length: 197  Bit Score: 60.95  E-value: 6.99e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  905 PKTFDGSMDnPTKAQMWLTSIETIFRYMKCPEDQK--VQCAVFFL---EDRGTAWWETAermlggDVSKITWEQFKENFY 979
Cdd:pfam19259   14 ILPFRGRKD-VLKLKSFISEIMLQMSMIFWPNDAEriVFCARHLTgpaAQWFHDFVQEQ------GILDATFDTFIKAFK 86
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  980 AKFFSANVKHAKLQEFLNLEQGDMTVEQYDAEFDMLSRFAPDMVRDEAARTEKFVRGLRLDLQGIVRALRPATHADALRI 1059
Cdd:pfam19259   87 QHFYGKPDINKLFNDIVNLSEAKLGIERYNSHFNRLWDLLPPDFLSEKAAIMFYIRGLKPETYIIVRLAKPSTLKEAMEI 166

                   ...
gi 1731019548 1060 ALD 1062
Cdd:pfam19259  167 AYE 169
retropepsin_like cd00303
Retropepsins; pepsin-like aspartate proteases; The family includes pepsin-like aspartate ...
1219-1283 9.35e-10

Retropepsins; pepsin-like aspartate proteases; The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-terminal lobes, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.


Pssm-ID: 133136  Cd Length: 92  Bit Score: 57.35  E-value: 9.35e-10
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1731019548 1219 FVQHVGLEVEPLGSVLSVSTPSGEVLLSKEQIKACRVEIANRMLDVTLLVLDMQDFDVILGMDWL 1283
Cdd:cd00303     27 LAKKLGLPPRLLPTPLKVKGANGSSVKTLGVILPVTIGIGGKTFTVDFYVLDLLSYDVILGRPWL 91
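The retropepsin description above places the active-site aspartate within an Asp-Thr/Ser-Gly motif. As a minimal sketch, assuming the query segment printed in the RVP_2 (pfam08284) alignment on this page (residues 1173-1252, which has no gap characters), the following Python scans that segment for candidate D[TS]G motifs and reports their residue numbers. The function name find_motif and the constant names are hypothetical and chosen only for illustration. A motif match alone does not establish catalytic activity.

```python
import re

# Query residues 1173-1252, copied from the RVP_2 (pfam08284) alignment
# shown on this page. The segment contains no gap characters.
SEGMENT_START = 1173
SEGMENT = (
    "QGRVFATTRQEAERAGTVVTGTLPILGHYAFVLFDSGSSHSFISS"
    "VFVQHVGLEVEPLGSVLSVSTPSGEVLLSKEQIKA"
)

# Asp-Thr/Ser-Gly, the motif named in the retropepsin description.
MOTIF = re.compile(r"D[TS]G")


def find_motif(segment: str, start: int) -> list[tuple[int, str]]:
    """Return (residue_number, matched_text) for each motif occurrence.

    `start` is the residue number of the first character in `segment`.
    """
    return [(start + m.start(), m.group()) for m in MOTIF.finditer(segment)]


hits = find_motif(SEGMENT, SEGMENT_START)
for residue, text in hits:
    print(f"{text} at residue {residue}")
```

On this segment the scan reports a single candidate, DSG beginning at residue 1207, which lines up with the DSG in the pfam08284 consensus row of the same alignment.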
MGT TIGR01426
glycosyltransferase, MGT family; This model describes the MGT (macroside glycosyltransferase) ...
352-409 6.59e-09

glycosyltransferase, MGT family; This model describes the MGT (macroside glycosyltransferase) subfamily of the UDP-glucuronosyltransferase family. Members include a number of glucosyl transferases for macrolide antibiotic inactivation, but also include transferases of glucose-related sugars for macrolide antibiotic production. [Cellular processes, Toxin production and resistance]


Pssm-ID: 273616 [Multi-domain]  Cd Length: 392  Bit Score: 60.47  E-value: 6.59e-09
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1731019548  352 IIRGWAPQLLILSHwsTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNEKLVVEI 409
Cdd:TIGR01426  278 EVRQWVPQLEILKK--ADAFITHGGMNSTMEALFNGVPMVAVPQGADQPMTARRIAEL 333
YjiC COG1819
UDP:flavonoid glycosyltransferase YjiC, YdhE family [Carbohydrate transport and metabolism];
351-416 8.15e-09

UDP:flavonoid glycosyltransferase YjiC, YdhE family [Carbohydrate transport and metabolism];


Pssm-ID: 441424 [Multi-domain]  Cd Length: 268  Bit Score: 59.10  E-value: 8.15e-09
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1731019548  351 LIIRGWAPQLLILSHwsTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNEKLVVEiLKIGVRV 416
Cdd:COG1819    171 VRVVDYVPQDALLPR--ADAVVHHGGAGTTAEALRAGVPQVVVPFGGDQPLNAARVER-LGAGLAL 233
RT_DIRS1 cd03714
RT_DIRS1: Reverse transcriptases (RTs) occurring in the DIRS1 group of retrotransposons. Members ...
1478-1589 9.58e-09

RT_DIRS1: Reverse transcriptases (RTs) occurring in the DIRS1 group of retrotransposons. Members of the subfamily include the Dictyostelium DIRS-1, Volvox carteri kangaroo, and Panagrellus redivivus PAT elements. These elements differ from LTR and conventional non-LTR retrotransposons. They contain split direct repeat (SDR) termini, and have been proposed to integrate via double-stranded closed-circle DNA intermediates assisted by an encoded recombinase which is similar to gamma-site-specific integrase.


Pssm-ID: 239684 [Multi-domain]  Cd Length: 119  Bit Score: 55.43  E-value: 9.58e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1478 IDLRSGYHQLRIRDGDIPKTAFRSRYGHYEFVVMSFGLTNAPAVFMDLMNRVFKEF--LDSFVIVFIDDILIYSKTEAEH 1555
Cdd:cd03714      1 VDLKDAYFHIPILPRSRDLLGFAWQGETYQFKALPFGLSLAPRVFTKVVEALLAPLrlLGVRIFSYLDDLLIIASSIKTS 80
                           90       100       110
                   ....*....|....*....|....*....|....*.
gi 1731019548 1556 EEHLHQVLETLRANK-LYAKFSKCEFWL-RKVTFLG 1589
Cdd:cd03714     81 EAVLRHLRATLLANLgFTLNLEKSKLGPtQRITFLG 116
RT_like cd00304
RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is ...
1509-1592 1.75e-08

RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contains fungal mitochondrial introns and transposable elements that lack LTRs.


Pssm-ID: 238185 [Multi-domain]  Cd Length: 98  Bit Score: 53.89  E-value: 1.75e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1509 VVMSFGLTNAPAVFMDLMNRVFKEF----LDSFVIVFIDDILIYSKTEaEHEEHLHQVLETLRANKLYAKFSKCEF--WL 1582
Cdd:cd00304     10 IPLPQGSPLSPALANLYMEKLEAPIlkqlLDITLIRYVDDLVVIAKSE-QQAVKKRELEEFLARLGLNLSDEKTQFteKE 88
                           90
                   ....*....|
gi 1731019548 1583 RKVTFLGHVV 1592
Cdd:cd00304     89 KKFKFLGILV 98
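
The numbers flanking the `Cdd:` rows give the first and last model positions covered by the alignment, and `Cd Length` gives the full model length. Together they indicate how much of the domain model the hit spans. The sketch below computes that fraction; `model_coverage` is a hypothetical helper, and the ratio is an informal measure rather than a statistic reported by CDD.

```python
def model_coverage(model_start, model_end, cd_length):
    """Fraction of the domain model spanned by the alignment (positions are 1-based, inclusive)."""
    if not (1 <= model_start <= model_end <= cd_length):
        raise ValueError("expected 1 <= model_start <= model_end <= cd_length")
    return (model_end - model_start + 1) / cd_length


# cd00304 (RT_like): the alignment runs from model position 10 to 98; Cd Length is 98.
rt_like = model_coverage(10, 98, 98)      # 89 of 98 positions

# cd03714 (RT_DIRS1): the alignment runs from model position 1 to 116; Cd Length is 119.
rt_dirs1 = model_coverage(1, 116, 119)    # 116 of 119 positions
```

Both RT hits span most of their models, although the span counts gapped and unaligned positions inside the range as covered.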
RNase_H_like cd06222
Ribonuclease H-like superfamily, including RNase H, HI, HII, HIII, and RNase-like domain IV of ...
1687-1799 1.02e-07

Ribonuclease H-like superfamily, including RNase H, HI, HII, HIII, and RNase-like domain IV of spliceosomal protein Prp8; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. It is widely present in various organisms, including bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site residues and have the same catalytic mechanism and functions in cells. RNase H is involved in DNA replication, repair and transcription. An important RNase H function is to remove the RNA primers from Okazaki fragments during DNA replication. RNase H inhibitors have been explored as anti-HIV drugs since RNase H inactivation inhibits reverse transcription. This model also includes the Prp8 domain IV, which adopts the RNase fold but shows low sequence homology; domain IV is implicated in key spliceosomal interactions.


Pssm-ID: 259998 [Multi-domain]  Cd Length: 121  Bit Score: 52.70  E-value: 1.02e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1687 IYSDASKK------GLGCVLMQQGKVVaYASRQLKIHEQNyPTHdLELAAVVFALKIWRHYLYgEKIQIYTDHKSL---- 1756
Cdd:cd06222      1 INVDGSCRgnpgpaGIGGVLRDHEGGW-LGGFALKIGAPT-ALE-AELLALLLALELALDLGY-LKVIIESDSKYVvdli 76
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*
gi 1731019548 1757 --KYFFTQKELNMRQRRWLELVKDYDCEILYHPGKANVVADALSR 1799
Cdd:cd06222     77 nsGSFKWSPNILLIEDILLLLSRFWSVKISHVPREGNQVADALAK 121
PHA03378 PHA03378
EBNA-3B; Provisional
497-672 4.35e-07

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 55.46  E-value: 4.35e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  497 AQPSRNRNRAVAELPAAPGESSKPICAAE--RPSEQPSLARRsrveaasaRPQPTRQPAEsrrfdPTQAQPKRQSVAVAE 574
Cdd:PHA03378   674 YQPSPTGANTMLPIQWAPGTMQPPPRAPTpmRPPAAPPGRAQ--------RPAAATGRAR-----PPAAAPGRARPPAAA 740
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  575 RSRGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREIP--VPSRAASAQAEPV-LSSSSRAARTKLAP 651
Cdd:PHA03378   741 PGRARPPAAAPGRARPPAAAPGRARPPAAAPGAPTPQPPPQAPPAPQQRPrgAPTPQPPPQAGPTsMQLMPRAAPGQQGP 820
                          170       180
                   ....*....|....*....|.
gi 1731019548  652 STYILRGISSVGARALQPPSR 672
Cdd:PHA03378   821 TKQILRQLLTGGVKRGRPSLK 841
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
495-646 5.89e-07

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 54.86  E-value: 5.89e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  495 SAAQPSRNRNRAVAELPAAPGESSKPICAAERPSEQPSLARRS--------RVEAASARPQPTRQPAESRRFDPTQAQPK 566
Cdd:PRK07003   387 AAAAVGASAVPAVTAVTGAAGAALAPKAAAAAAATRAEAPPAApappatadRGDDAADGDAPVPAKANARASADSRCDER 466
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  567 RQSVAVAERSRGSRTTNSRRAARNpEAEASRAKPSRVSPVASRAASPPVASRAAREIPVPSRAASAQAEPVLSSSSRAAR 646
Cdd:PRK07003   467 DAQPPADSGSASAPASDAPPDAAF-EPAPRAAAPSAATPAAVPDARAPAAASREDAPAAAAPPAPEARPPTPAAAAPAAR 545
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
496-645 1.33e-06

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 53.70  E-value: 1.33e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  496 AAQPSRNRNRAVAELPAAPGESSKPicAAERPSEQPSLARRSRVEAASARPQPTRQPAEsrrfdPTQAQPKRQSVAVAE- 574
Cdd:PRK07003   480 PASDAPPDAAFEPAPRAAAPSAATP--AAVPDARAPAAASREDAPAAAAPPAPEARPPT-----PAAAAPAARAGGAAAa 552
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1731019548  575 ----RSRGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREIPVPSRAASAQAEPvlSSSSRAA 645
Cdd:PRK07003   553 ldvlRNAGMRVSSDRGARAAAAAKPAAAPAAAPKPAAPRVAVQVPTPRARAATGDAPPNGAARAEQ--AAESRGA 625
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
496-623 1.69e-06

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 53.45  E-value: 1.69e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  496 AAQPSRNRNRAVAELPAAPGEsskpicAAERPSEQPslARRSRVEAASARPQPTRQPAESRRFDPTQAQPKRQSVAVAER 575
Cdd:PRK07764   393 APAAAAPSAAAAAPAAAPAPA------AAAPAAAAA--PAPAAAPQPAPAPAPAPAPPSPAGNAPAGGAPSPPPAAAPSA 464
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*...
gi 1731019548  576 SRGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREI 623
Cdd:PRK07764   465 QPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAPAGADDAATL 512
recX PRK14136
recombination regulator RecX; Provisional
501-763 1.96e-06

recombination regulator RecX; Provisional


Pssm-ID: 237620 [Multi-domain]  Cd Length: 309  Bit Score: 52.31  E-value: 1.96e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  501 RNRNRAVAELPAAPGESSKpicAAERPSEQPSLARRSRVEAASARPQpTRQPAES----RRFDPTQAQPKRQSVAVAERS 576
Cdd:PRK14136     3 RRRQGADPQEADHPARAAR---AGRPHASRETDRTVSGEGRPAGRTA-TRASDDAlvsfEIAAPDEPFDDDESFDAHDRA 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  577 RGSRttnSRRAARNPEAEASRAKPSRVSPVASRAASPPVA----SRAAREIPVPSRAASaqaePVLSSSSRAARTKLAPS 652
Cdd:PRK14136    79 RRRV---SGVGVRDAGAPGGRAADARAANLSSAAKRAEAAgdvyTRTSQHPRRTRRAAG----PFHSDSSPSASSEDDGA 151
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  653 TyiLRGISSVGARALQPPSRDFVAWKAYCLAVRTRQvdLQIYFGYDECRDG-----ECRNDLCDMLYADAMCMYTAIRVP 727
Cdd:PRK14136   152 A--RSRASSRPARSLKGRALGYLSRREYSRAELARK--LAPYADESDSVEPlldalEREGWLSDARFAESLVHRRASRVG 227
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|....*
gi 1731019548  728 VSLILLESYLHGcpsgsppIEDCVVE---------DEARARASWR 763
Cdd:PRK14136   228 SARIVSELKRHA-------VGDALVEsvgaqlretEFERAQAVWR 265
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
496-670 2.05e-06

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 53.34  E-value: 2.05e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  496 AAQPSR-NRNRAVAELPAAPGESSKPICAAERPSEQPSLARRSRVEAASARPQPTRQPAESrrfdPTQAQPKRQSVAVAE 574
Cdd:PRK12323   362 AFRPGQsGGGAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAA----PARRSPAPEALAAAR 437
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  575 ----RSRGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREIPVPSRAASAQAEPVLSSSSRAARTKLA 650
Cdd:PRK12323   438 qasaRGPGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEELPPEFASPAPAQPDAA 517
                          170       180
                   ....*....|....*....|
gi 1731019548  651 PSTYILRGISSVGARALQPP 670
Cdd:PRK12323   518 PAGWVAESIPDPATADPDDA 537
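
Several of the hits listed so far map to nearly the same stretch of the query. Merging overlapping query intervals makes this easier to see. The sketch below is a plain interval merge; `merge_intervals` is a hypothetical helper, and the intervals are the query ranges of the hits from RT_DIRS1 through PRK12323 as shown above.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) query intervals; coordinates are 1-based, inclusive."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region: extend it if this hit reaches further.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(region) for region in merged]


# Query intervals copied from the hits above (RT_DIRS1 through PRK12323).
hits = [
    (1478, 1589), (1509, 1592), (1687, 1799),
    (497, 672), (495, 646), (496, 645),
    (496, 623), (501, 763), (496, 670),
]
regions = merge_intervals(hits)
# regions == [(495, 763), (1478, 1592), (1687, 1799)]
```

The nine hits collapse into three query regions: one around residues 495-763 that attracts most of the provisional PRK and PHA models, and two further downstream that carry the reverse transcriptase and RNase H hits.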
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
486-672 2.29e-06

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 53.25  E-value: 2.29e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  486 EDPPSILLCSAAQPSRNRNRAVAELPAAPGESSKPICAAERPSEQPSLARRSRV-----EAASARPQPTRQPAESRRFDP 560
Cdd:PHA03307   141 VGSPGPPPAASPPAAGASPAAVASDAASSRQAALPLSSPEETARAPSSPPAEPPpstppAAASPRPPRRSSPISASASSP 220
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  561 TQAQPKRQSVAVAERSRGSRTTNSRRAARNPEAEASRAKPSRVS----------PVASRAASPPVASRAAREIPVPSRAA 630
Cdd:PHA03307   221 APAPGRSAADDAGASSSDSSSSESSGCGWGPENECPLPRPAPITlptriweasgWNGPSSRPGPASSSSSPRERSPSPSP 300
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|..
gi 1731019548  631 SAQAEPVLSSSSRAARTKLAPSTYILRGISSVGARALQPPSR 672
Cdd:PHA03307   301 SSPGSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVS 342
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
486-671 3.82e-06

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 52.48  E-value: 3.82e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  486 EDPPSILLCSAAQPSRNRNRAvAELPAAPGESSK---PICAA----ERPSEQPSLARRSRVEAASARPQPTRQPAESRRf 558
Cdd:PHA03307   209 RSSPISASASSPAPAPGRSAA-DDAGASSSDSSSsesSGCGWgpenECPLPRPAPITLPTRIWEASGWNGPSSRPGPAS- 286
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  559 dPTQAQPKRQSVAVAERSRGSRTTNSRRAARN--PEAEASRAKPSRVSPVASRAASPPVASRAA---------------- 620
Cdd:PHA03307   287 -SSSSPRERSPSPSPSSPGSGPAPSSPRASSSssSSRESSSSSTSSSSESSRGAAVSPGPSPSRspspsrppppadpssp 365
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1731019548  621 REIPVPSRAASAQAEPVLSSSSRAARtKLAPSTYILRGISSVGARALQPPS 671
Cdd:PHA03307   366 RKRPRPSRAPSSPAASAGRPTRRRAR-AAVAGRARRRDATGRFPAGRPRPS 415
egt PHA03392
ecdysteroid UDP-glucosyltransferase; Provisional
336-416 4.21e-06

ecdysteroid UDP-glucosyltransferase; Provisional


Pssm-ID: 223071 [Multi-domain]  Cd Length: 507  Bit Score: 51.88  E-value: 4.21e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  336 WLSEEEFERKTKGRGLIIRGWAPQLLILSHWSTGGFLTHCGWNSTVEGIGNGVPMITWPQFAEQFLNEKLVVEiLKIGVR 415
Cdd:PHA03392   333 WKYDGEVEAINLPANVLTQKWFPQRAVLKHKNVKAFVTQGGVQSTDEAIDALVPMVGLPMMGDQFYNTNKYVE-LGIGRA 411

                   .
gi 1731019548  416 V 416
Cdd:PHA03392   412 L 412
PHA03247 PHA03247
large tegument protein UL36; Provisional
496-670 6.56e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.86  E-value: 6.56e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  496 AAQPSRNRNRAVAELPAAPGESSKPICAAERPSEQPSLARRSRVEAASARPQPtrqPAESRRFDPTQAqPKRQSVAVAER 575
Cdd:PHA03247  2574 APRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDP---PPPSPSPAANEP-DPHPPPTVPPP 2649
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  576 SR------GSRTTNSRRAARNPEAEASRAKPSRVSPvasRAASPPVASRAAREIPVPSRAASAQAEPVLSSSsraarTKL 649
Cdd:PHA03247  2650 ERprddpaPGRVSRPRRARRLGRAAQASSPPQRPRR---RAARPTVGSLTSLADPPPPPPTPEPAPHALVSA-----TPL 2721
                          170       180
                   ....*....|....*....|.
gi 1731019548  650 APSTYILRGISSVGARALQPP 670
Cdd:PHA03247  2722 PPGPAAARQASPALPAAPAPP 2742
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
505-636 9.32e-06

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 50.87  E-value: 9.32e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  505 RAVAELPAAPGESSKPICAAERPSEQPSLARRSRVEAASARPQPTRQPAESRRfdPTQAQPKRQSVAVAERSRGSRTTNS 584
Cdd:PRK14951   365 KPAAAAEAAAPAEKKTPARPEAAAPAAAPVAQAAAAPAPAAAPAAAASAPAAP--PAAAPPAPVAAPAAAAPAAAPAAAP 442
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1731019548  585 RRAARNPEAEAsRAKPSRVSPVASRAASPPVASRAAREIPVPSRAASAQAEP 636
Cdd:PRK14951   443 AAVALAPAPPA-QAAPETVAIPVRVAPEPAVASAAPAPAAAPAAARLTPTEE 493
transpos_IS481 NF033577
IS481 family transposase
1950-2078 3.65e-05

IS481 family transposase


Pssm-ID: 468094 [Multi-domain]  Cd Length: 283  Bit Score: 47.97  E-value: 3.65e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1950 RQHPAGLLQplsvpgwkwesvsMDfITGLPKTL-RGYTVIWVVVDRLTKSAH--FVPGKSTYTASKwgqlYMTEIVRLHG 2026
Cdd:NF033577   124 RAHPGELWH-------------ID-IKKLGRIPdVGRLYLHTAIDDHSRFAYaeLYPDETAETAAD----FLRRAFAEHG 185
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1731019548 2027 VPV-SIISDRDARFTSKFwKGLQLAL---GTRLDFSTAFHPQTDGQTERLNQILED 2078
Cdd:NF033577   186 IPIrRVLTDNGSEFRSRA-HGFELALaelGIEHRRTRPYHPQTNGKVERFHRTLKD 240
rve pfam00665
Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into ...
1963-2063 4.41e-05

Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc-binding domain (pfam02022). This domain is the central catalytic domain. The carboxyl-terminal domain is a non-specific DNA-binding domain (pfam00552). The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.


Pssm-ID: 459897 [Multi-domain]  Cd Length: 98  Bit Score: 44.23  E-value: 4.41e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1963 PGWKWEsvsMDFITGLPKTLRGYTVIWVVVDRLTK--SAHFVPGKSTYTAskWGQLYMTEIVRLHGVPVSIISDRDARFT 2040
Cdd:pfam00665    1 PNQLWQ---GDFTYIRIPGGGGKLYLLVIVDDFSReiLAWALSSEMDAEL--VLDALERAIAFRGGVPLIIHSDNGSEYT 75
                           90       100
                   ....*....|....*....|...
gi 1731019548 2041 SKFWKGLQLALGTRLDFSTAFHP 2063
Cdd:pfam00665   76 SKAFREFLKDLGIKPSFSRPGNP 98
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
497-634 4.62e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.83  E-value: 4.62e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  497 AQPSRNRNRAVAELPAAPGESSKPICAAERPSEQPSLARRSRVEAASARPQPTRQPAesrrfdPTQAQPkrqsvaVAERS 576
Cdd:PRK07764   380 RLERRLGVAGGAGAPAAAAPSAAAAAPAAAPAPAAAAPAAAAAPAPAAAPQPAPAPA------PAPAPP------SPAGN 447
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1731019548  577 RGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREIPVPSRAASAQA 634
Cdd:PRK07764   448 APAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAPAG 505
RnhA COG0328
Ribonuclease HI [Replication, recombination and repair];
1686-1800 4.79e-05

Ribonuclease HI [Replication, recombination and repair];


Pssm-ID: 440097 [Multi-domain]  Cd Length: 136  Bit Score: 45.22  E-value: 4.79e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1686 VIYSDAS------KKGLGCVLMQQGKVvayasRQLKIHEQNYPTHDLELAAVVFALKIWRHyLYGEKIQIYTDHKSLKYF 1759
Cdd:COG0328      4 EIYTDGAcrgnpgPGGWGAVIRYGGEE-----KELSGGLGDTTNNRAELTALIAALEALKE-LGPCEVEIYTDSQYVVNQ 77
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1731019548 1760 FTQKELNMRQRRW------------LELVKDYDCEILYHPGKA----NVVADALSRK 1800
Cdd:COG0328     78 ITGWIHGWKKNGWkpvknpdlwqrlDELLARHKVTFEWVKGHAghpgNERADALANK 134
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
510-653 6.93e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.06  E-value: 6.93e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  510 LPAAPGESSKPICAAERPSEQPSLARRSRVEAASARPQPTRQPAESrrfdPTQAQPKRQSVAVAersrgsrttnsRRAAR 589
Cdd:PRK07764   364 LPSASDDERGLLARLERLERRLGVAGGAGAPAAAAPSAAAAAPAAA----PAPAAAAPAAAAAP-----------APAAA 428
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1731019548  590 NPEAEASRAKPSRVSPVASRAASPPVASRAAREIPVPSRAASAQAEPVLSSSSRAARTKLAPST 653
Cdd:PRK07764   429 PQPAPAPAPAPAPPSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAA 492
RT_pepA17 cd01644
RT_pepA17: Reverse transcriptases (RTs) in retrotransposons. This subfamily represents the RT ...
1464-1611 8.39e-05

RT_pepA17: Reverse transcriptases (RTs) in retrotransposons. This subfamily represents the RT domain of a multifunctional enzyme. C-terminal to the RT domain is a domain homologous to aspartic proteinases (corresponding to Merops family A17) encoded by retrotransposons and retroviruses. RT catalyzes DNA replication from an RNA template and is responsible for the replication of retroelements.


Pssm-ID: 238822  Cd Length: 213  Bit Score: 46.14  E-value: 8.39e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1464 DLFDQL----QGATVFSKiDLRSGYHQLRIR-------------DGDIPKTAfrsrygHYEFVVMSFGLTNAPAvfmdLM 1526
Cdd:cd01644     47 SLFGVLlrfrQGKIAVSA-DIEKMFHQVKVRpedrdvlrflwrkDGDEPKPI------EYRMTVVPFGAASAPF----LA 115
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548 1527 NRVFKEFLDSF----------VIVFIDDILIYSKTEAEHEEhlhqVLETLRANKLYAKFSKCEFWLRKVTFLGHVvssEG 1596
Cdd:cd01644    116 NRALKQHAEDHpheaaakiikRNFYVDDILVSTDTLNEAVN----VAKRLIALLKKGGFNLRKWASNSQEVLDDL---PE 188
                          170
                   ....*....|....*
gi 1731019548 1597 VSVDPAKIEAVTNWT 1611
Cdd:cd01644    189 ERVLLDRDSDVTEKT 203
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
488-634 8.56e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 47.92  E-value: 8.56e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  488 PPSILLCSAAQPSRNRNRAVAELPAAPGESSKPIC--AAERPSEQPSLARRSRVEAASARPQPTRQPAESRRFDPTQAQP 565
Cdd:PRK07003   401 AVTGAAGAALAPKAAAAAAATRAEAPPAAPAPPATadRGDDAADGDAPVPAKANARASADSRCDERDAQPPADSGSASAP 480
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1731019548  566 KRQSVAVAERSRGSRTTNSRRAARNPEAEASR----AKPSRVSPVASRA-ASPPVASRAAREipvPSRAASAQA 634
Cdd:PRK07003   481 ASDAPPDAAFEPAPRAAAPSAATPAAVPDARApaaaSREDAPAAAAPPApEARPPTPAAAAP---AARAGGAAA 551
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
496-652 1.67e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 47.15  E-value: 1.67e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  496 AAQPSRNRNRAVAELPAAPGESSKPICAAERPSEQPSLAR-RSRVEAASARPQP--TRQPAESRRFDPTQAQPKRQSVAV 572
Cdd:PRK07003   379 AVPAPGARAAAAVGASAVPAVTAVTGAAGAALAPKAAAAAaATRAEAPPAAPAPpaTADRGDDAADGDAPVPAKANARAS 458
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  573 AERSRGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREIPVPSRAASAQAEPVLSSSSRAARTKLAPS 652
Cdd:PRK07003   459 ADSRCDERDAQPPADSGSASAPASDAPPDAAFEPAPRAAAPSAATPAAVPDARAPAAASREDAPAAAAPPAPEARPPTPA 538
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
511-646 1.77e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 46.90  E-value: 1.77e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  511 PAAPGESSKPICAAERPSEQPSLARrSRVEAASARPQPTRQPAES--------RRFDPTQAQPKRQSVAVAERSRGSRTT 582
Cdd:PRK07764   593 GAAGGEGPPAPASSGPPEEAARPAA-PAAPAAPAAPAPAGAAAAPaeasaapaPGVAAPEHHPKHVAVPDASDGGDGWPA 671
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1731019548  583 NSRRAArnPEAEASRAKPSRVSPVASRAASPPVASRAAREIPVPSRAASAQAEPVLSSSSRAAR 646
Cdd:PRK07764   672 KAGGAA--PAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSP 733
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
488-641 1.89e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 46.79  E-value: 1.89e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  488 PPSILLCSAAQPSRNRNRAVAELPAAPgeSSKPIcAAERPSEQPSLARRSRVEAASARPQPTRQPA---------ESRRF 558
Cdd:PRK12323   428 PAPEALAAARQASARGPGGAPAPAPAP--AAAPA-AAARPAAAGPRPVAAAAAAAPARAAPAAAPApadddpppwEELPP 504
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  559 DPTQAQPKRQSVAVAERSRGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREIPVPSRAASAQAEPVL 638
Cdd:PRK12323   505 EFASPAPAQPDAAPAGWVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRPPRASASGLPDMFDGDWPAL 584

                   ...
gi 1731019548  639 SSS 641
Cdd:PRK12323   585 AAR 587
flhF PRK06995
flagellar biosynthesis protein FlhF;
507-624 2.38e-04

flagellar biosynthesis protein FlhF;


Pssm-ID: 235904 [Multi-domain]  Cd Length: 484  Bit Score: 46.11  E-value: 2.38e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  507 VAELPAAPGESSKPICAAERPSEQPSLARRsrvEAASARPQPTRQPAESRrfdptqAQPKRQSVAVAERSRGSRTTNSRR 586
Cdd:PRK06995    52 APPAAAAPAAAQPPPAAAPAAVSRPAAPAA---EPAPWLVEHAKRLTAQR------EQLVARAAAPAAPEAQAPAAPAER 122
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|...
gi 1731019548  587 AARNP-EAEASRAKPS----RVSPVASRAASPPVASRAAREIP 624
Cdd:PRK06995   123 AAAENaARRLARAAAAaprpRVPADAAAAVADAVKARIERIVN 165
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
512-671 2.70e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 46.38  E-value: 2.70e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  512 AAPGESSKPICAAERPSEQPSLARRSRVEAASARPQPTRQPAESRRfdPTQAQPKRQSVAVAERSRgsrttnSRRAARNP 591
Cdd:PRK07003   366 GAPGGGVPARVAGAVPAPGARAAAAVGASAVPAVTAVTGAAGAALA--PKAAAAAAATRAEAPPAA------PAPPATAD 437
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  592 EAEASRAKPsrvSPVASRAASP-PVASRAAREIPVPSRAASAQAEPVlSSSSRAARTKLAPSTYILRGISSVGARALQPP 670
Cdd:PRK07003   438 RGDDAADGD---APVPAKANARaSADSRCDERDAQPPADSGSASAPA-SDAPPDAAFEPAPRAAAPSAATPAAVPDARAP 513

                   .
gi 1731019548  671 S 671
Cdd:PRK07003   514 A 514
PRK12727 PRK12727
flagellar biosynthesis protein FlhF;
506-671 3.28e-04

flagellar biosynthesis protein FlhF;


Pssm-ID: 237182 [Multi-domain]  Cd Length: 559  Bit Score: 45.75  E-value: 3.28e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  506 AVAELPAAPGESSKPICAAERPSEQPSLARRSRVEAA-----SARPQptRQPAESRRFDPTQAQPKRQSVAVAERSRGSR 580
Cdd:PRK12727    68 APAPAPQAPTKPAAPVHAPLKLSANANMSQRQRVASAaedmiAAMAL--RQPVSVPRQAPAAAPVRAASIPSPAAQALAH 145
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  581 TTNSRRAARNPEAEASRakPSRVSPVASRAASPPVASRAAREIPVPSRA-ASAQAEPVLSSSSRAARTKLAPSTYILrgi 659
Cdd:PRK12727   146 AAAVRTAPRQEHALSAV--PEQLFADFLTTAPVPRAPVQAPVVAAPAPVpAIAAALAAHAAYAQDDDEQLDDDGFDL--- 220
                          170
                   ....*....|..
gi 1731019548  660 SSVGARALQPPS 671
Cdd:PRK12727   221 DDALPQILPPAA 232
PHA03381 PHA03381
tegument protein VP22; Provisional
514-653 3.70e-04

tegument protein VP22; Provisional


Pssm-ID: 177618 [Multi-domain]  Cd Length: 290  Bit Score: 45.00  E-value: 3.70e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  514 PGESSKPICAAERPSEQPSLARRSRVEAASARPQPTRQPAESRRfdPTQAQPKRQSVAVAERSRGSrttnSRRAARNPEA 593
Cdd:PHA03381    40 PADRARRGAGQARGRSQAERRFHHYDEARADYPYYTGSSSEDER--PADPRPSRRPHAQPEASGPG----PARGARGPAG 113
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1731019548  594 EASRAK----PSRVSPVASRAASPPVASRAAREIPVPSRAASAQAEPVLSSSSRAARTKLAPST 653
Cdd:PHA03381   114 SRGRGRraesPSPRDPPNPKGASAPRGRKSACADSAALLDAPAPAAPKRQKTPAGLARKLHFST 177
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
495-652 6.09e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 44.86  E-value: 6.09e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  495 SAAQPSRNRNRAVAELPAAPGESSKPicAAERPSEQPSLAR-RSRVEAASARPQPTRQPAESR--RFDPTQAQPKRQSVA 571
Cdd:PRK07994   377 PAASAQATAAPTAAVAPPQAPAVPPP--PASAPQQAPAVPLpETTSQLLAARQQLQRAQGATKakKSEPAAASRARPVNS 454
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  572 VAERSRGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREipvPSRAASAQAEPVLSSSSRAARTKLAP 651
Cdd:PRK07994   455 ALERLASVRPAPSALEKAPAKKEAYRWKATNPVEVKKEPVATPKALKKALE---HEKTPELAAKLAAEAIERDPWAALVS 531

                   .
gi 1731019548  652 S 652
Cdd:PRK07994   532 Q 532
PHA03247 PHA03247
large tegument protein UL36; Provisional
487-671 6.25e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 45.31  E-value: 6.25e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  487 DPPSILLCSAAQPSRNRNRavaelPAAPGESSKPICAA----------ERPSEQPSLARRSRVEAASARPQPTRQPAESR 556
Cdd:PHA03247  2607 DPRGPAPPSPLPPDTHAPD-----PPPPSPSPAANEPDphppptvpppERPRDDPAPGRVSRPRRARRLGRAAQASSPPQ 2681
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  557 RFDPTQAQPkrqsvAVAERSRGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRaareiPVPSRAASAQAEP 636
Cdd:PHA03247  2682 RPRRRAARP-----TVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAA-----PAPPAVPAGPATP 2751
                          170       180       190
                   ....*....|....*....|....*....|....*.
gi 1731019548  637 V-LSSSSRAARTKLAPSTYILRGISSVGARALQPPS 671
Cdd:PHA03247  2752 GgPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPA 2787
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
508-651 8.26e-04


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 44.71  E-value: 8.26e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  508 AELPAAPGESS------------KPICAAER--PSEQPSLARRSRVEAASARPQPTRQPAESRRFDPTQAQPKRQSVAVA 573
Cdd:PRK14951   342 AELGLAPDEYAaltmvllrllafKPAAAAEAaaPAEKKTPARPEAAAPAAAPVAQAAAAPAPAAAPAAAASAPAAPPAAA 421
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  574 ErsrgsrttnSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREIP--VPSRAASAQAEPVLSSSSRAARTKLAP 651
Cdd:PRK14951   422 P---------PAPVAAPAAAAPAAAPAAAPAAVALAPAPPAQAAPETVAIPvrVAPEPAVASAAPAPAAAPAAARLTPTE 492
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
523-672 9.16e-04


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 44.59  E-value: 9.16e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  523 AAERPSEQPSLARRSRVEAAsARPQPTRQPAEsrrfdptQAQPKRQSVAvAERSRGSRTTNSRRAA--RNPEAEASRAKP 600
Cdd:PRK07764   593 GAAGGEGPPAPASSGPPEEA-ARPAAPAAPAA-------PAAPAPAGAA-AAPAEASAAPAPGVAApeHHPKHVAVPDAS 663
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1731019548  601 SRVSPVASRAASPPVASRAAREIPVPSRAASAQAEPVlSSSSRAARTKLAPSTYILRGISSVGARALQPPSR 672
Cdd:PRK07764   664 DGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQ-PAPAPAATPPAGQADDPAAQPPQAAQGASAPSPA 734
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
495-652 9.30e-04


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 44.46  E-value: 9.30e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  495 SAAQPSRNRNRAVAELPAAPGESSKPICAAerPSEQPSLARRSRVEAASArpqPTRQPAESRRFDPTQAQPKRQSVAVAE 574
Cdd:PRK07003   372 VPARVAGAVPAPGARAAAAVGASAVPAVTA--VTGAAGAALAPKAAAAAA---ATRAEAPPAAPAPPATADRGDDAADGD 446
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1731019548  575 rSRGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREIPVPSRAASAQAEPVLSSSSRAARTKLAPS 652
Cdd:PRK07003   447 -APVPAKANARASADSRCDERDAQPPADSGSASAPASDAPPDAAFEPAPRAAAPSAATPAAVPDARAPAAASREDAPA 523
RP_DDI cd05479
RP_DDI; retropepsin-like domain of DNA damage inducible protein; The family represents the ...
1250-1291 1.07e-03

RP_DDI; retropepsin-like domain of DNA damage inducible protein; The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.


Pssm-ID: 133146  Cd Length: 124  Bit Score: 41.00  E-value: 1.07e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|..
gi 1731019548 1250 IKACRVEIANRMLDVTLLVLDMQDFDVILGMDWLSANHANID 1291
Cdd:cd05479     75 IHLAQVKIGNLFLPCSFTVLEDDDVDFLIGLDMLKRHQCVID 116
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
495-676 1.18e-03


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 44.21  E-value: 1.18e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  495 SAAQPSRNRNRAVAElPAAPGESSKPI----CAAERPSEQPSLARRSRVEAASARPQPTRQPAESRRFDP-----TQAQP 565
Cdd:PRK07764   600 PPAPASSGPPEEAAR-PAAPAAPAAPAapapAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGwpakaGGAAP 678
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  566 KRQSVAVAERSRGSRTTnsrrAARNPEAEASRAKPSRVSPVASRAASPPVASRAAREIPVPSRAASAQAEPVLSSSSRAA 645
Cdd:PRK07764   679 AAPPPAPAPAAPAAPAG----AAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGA 754
                          170       180       190
                   ....*....|....*....|....*....|.
gi 1731019548  646 RTKLAPSTYILRGISSVGARALQPPSRDFVA 676
Cdd:PRK07764   755 PAQPPPPPAPAPAAAPAAAPPPSPPSEEEEM 785
PLN03237 PLN03237
DNA topoisomerase 2; Provisional
487-656 1.64e-03


Pssm-ID: 215641 [Multi-domain]  Cd Length: 1465  Bit Score: 44.08  E-value: 1.64e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  487 DPPSILLCSAAQPSRNRNRAVAELPAAPGESSKPICAAERPSEQPSLARRSRVEAASARPQPTRQPAESRRFDPTQAQPk 566
Cdd:PLN03237  1284 KMEETVKAVPARRAAARKKPLASVSVISDSDDDDDDFAVEVSLAERLKKKGGRKPAAANKKAAKPPAAAKKRGPATVQS- 1362
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  567 rQSVAVAERSRGSRTTNSrraarNPEaeaSRAKPSRVSPVASRAASppVASRAAREI------PVPSRAASAQAEPVLSS 640
Cdd:PLN03237  1363 -GQKLLTEMLKPAEAIGI-----SPE---KKVRKMRASPFNKKSGS--VLGRAATNKetesseNVSGSSSSEKDEIDVSA 1431
                          170
                   ....*....|....*.
gi 1731019548  641 SSRAARTKLAPSTYIL 656
Cdd:PLN03237  1432 KPRPQRANRKQTTYVL 1447
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
496-647 1.71e-03


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 43.71  E-value: 1.71e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  496 AAQPSRNRNRAVAELPAAPGESSKPICAAERPSEQPSLARRSRVEAASARPQPTrqPAESRRFDPTQAQPKRQSVAVAER 575
Cdd:PRK12323   404 AAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAA--PAAAARPAAAGPRPVAAAAAAAPA 481
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  576 SRGSRTTNSRRAARNPEAEASRAKPSRVSPVASRAASPPVAS-----------RAAREIPVPSRAASAQAEPVLSSSSRA 644
Cdd:PRK12323   482 RAAPAAAPAPADDDPPPWEELPPEFASPAPAQPDAAPAGWVAesipdpatadpDDAFETLAPAPAAAPAPRAAAATEPVV 561

                   ...
gi 1731019548  645 ART 647
Cdd:PRK12323   562 APR 564
gag-asp_proteas pfam13975
gag-polyprotein putative aspartyl protease; This family of putative aspartyl proteases is ...
1224-1284 1.72e-03

gag-polyprotein putative aspartyl protease; This family of putative aspartyl proteases is found predominantly in retroviral proteins.


Pssm-ID: 464060  Cd Length: 92  Bit Score: 39.48  E-value: 1.72e-03
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1731019548 1224 GLEVEPLGSVLSVSTPSGEVllSKEQIKACRVEIANRML-DVTLLVLDMQDFDVILGMDWLS 1284
Cdd:pfam13975   32 GLDRLVDAYPVTVRTANGTV--RAARVRLDSVKIGGIELrNVPAVVLPGDLDDVLLGMDFLK 91
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
505-670 1.90e-03


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.62  E-value: 1.90e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  505 RAVAELPAAPGESSKPICAAERPSEQPSLARRSRVEAASARPQPTRQPAESRRFDPTQAQPKRQSVAVAERSRGSRTTNS 584
Cdd:PHA03307   267 TRIWEASGWNGPSSRPGPASSSSSPRERSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPS 346
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  585 R-RAARNPEAEASRAKPSRVSPVASRAASPPVASRAAReiPVPSRAASAQAEPVLSSSSRAARTKLAPSTYILRGISSVG 663
Cdd:PHA03307   347 PsRSPSPSRPPPPADPSSPRKRPRPSRAPSSPAASAGR--PTRRRARAAVAGRARRRDATGRFPAGRPRPSPLDAGAASG 424

                   ....*..
gi 1731019548  664 ARALQPP 670
Cdd:PHA03307   425 AFYARYP 431
PRK12678 PRK12678
transcription termination factor Rho; Provisional
505-622 1.97e-03


Pssm-ID: 237171 [Multi-domain]  Cd Length: 672  Bit Score: 43.35  E-value: 1.97e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  505 RAVAELPAAPGESSKPICAAERPSEQPSLARRSRVEAASARPQPTRQPAESRRFDPTQAQPKRQSVAVAERSRGSRTTNS 584
Cdd:PRK12678    67 AATPAAPAAAARRAARAAAAARQAEQPAAEAAAAKAEAAPAARAAAAAAAEAASAPEAAQARERRERGEAARRGAARKAG 146
                           90       100       110
                   ....*....|....*....|....*....|....*...
gi 1731019548  585 RRAARNPEAEASRAKPSRVSPVASRAASPPVASRAARE 622
Cdd:PRK12678   147 EGGEQPATEARADAAERTEEEERDERRRRGDREDRQAE 184
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
495-642 2.10e-03


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.62  E-value: 2.10e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  495 SAAQPSRNRNRAVAELPAAPGESSKPICAAERPSEQPSLARRSRVEAASAR---------PQPTRQPAESRRFDPTQAQP 565
Cdd:PHA03307   773 ALLEPAEPQRGAGSSPPVRAEAAFRRPGRLRRSGPAADAASRTASKRKSRShtpdggsesSGPARPPGAAARPPPARSSE 852
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1731019548  566 KRQSVAVAERSRGSrttnSRRAARNPEAEASRAKPSrvspvasrAASPPVASRAAREIPVPSRAASAQAEPVLSSSS 642
Cdd:PHA03307   853 SSKSKPAAAGGRAR----GKNGRRRPRPPEPRARPG--------AAAPPKAAAAAPPAGAPAPRPRPAPRVKLGPMP 917
PRK12678 PRK12678
transcription termination factor Rho; Provisional
496-602 2.22e-03


Pssm-ID: 237171 [Multi-domain]  Cd Length: 672  Bit Score: 43.35  E-value: 2.22e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  496 AAQPSRNRNRAVAELPAAPGESSKPICAAERPSEQPSLARRSRVEAASARPQPTRQPAESRRfdptqAQPKRQSVAVAER 575
Cdd:PRK12678    99 AAKAEAAPAARAAAAAAAEAASAPEAAQARERRERGEAARRGAARKAGEGGEQPATEARADA-----AERTEEEERDERR 173
                           90       100
                   ....*....|....*....|....*..
gi 1731019548  576 SRGSRTTNSRRAARNPEAEASRAKPSR 602
Cdd:PRK12678   174 RRGDREDRQAEAERGERGRREERGRDG 200
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
511-674 2.77e-03


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.24  E-value: 2.77e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  511 PAAPGESSKPICAAERPSEQPSLARRSRVEAASARPQPTRQPAESRRFDPTQAQPKRQSVAVAERSRGSRttnsrRAARN 590
Cdd:PHA03307    78 EAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPP-----PAASP 152
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  591 PEAEASRAkpsrvsPVASRAASPPVASRAAREIPVPSRAASAQAEPVLSSSSRAARTKLAPSTYILRGISSVGARALQPP 670
Cdd:PHA03307   153 PAAGASPA------AVASDAASSRQAALPLSSPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGR 226

                   ....
gi 1731019548  671 SRDF 674
Cdd:PHA03307   227 SAAD 230
flhF PRK06995
flagellar biosynthesis protein FlhF;
532-644 3.46e-03


Pssm-ID: 235904 [Multi-domain]  Cd Length: 484  Bit Score: 42.26  E-value: 3.46e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  532 SLARRSRVEAASARPQPTRQPAESRRFDPTQAQPKRQSVAVAERSRGSRTTNSRRAARNPEAEA-SRAKPSRVSPVASRA 610
Cdd:PRK06995    50 ALAPPAAAAPAAAQPPPAAAPAAVSRPAAPAAEPAPWLVEHAKRLTAQREQLVARAAAPAAPEAqAPAAPAERAAAENAA 129
                           90       100       110
                   ....*....|....*....|....*....|....
gi 1731019548  611 ASPPVASRAAREIPVPSRAASAQAEPVLSSSSRA 644
Cdd:PRK06995   130 RRLARAAAAAPRPRVPADAAAAVADAVKARIERI 163
CD_POL_like cd18979
chromodomain of a Zea mays putative Metaviridae (gypsy-type) retrotransposon polyprotein ...
2268-2315 3.58e-03

chromodomain of a Zea mays putative Metaviridae (gypsy-type) retrotransposon polyprotein (Z195D10.9), and similar proteins; This subgroup includes the CHROMO (CHRomatin Organization MOdifier) domain found in the Zea mays Z195D10.9 protein and in other putative TY3/gypsy retrotransposon polyproteins. The pol gene in TY3/gypsy elements generally encodes domains in the following order: an aspartyl protease, a reverse transcriptase, RNase H, and an integrase; here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain is a conserved region of about 50 amino acids found in a variety of chromosomal proteins; it is implicated in binding the proteins in which it occurs to methylated histone tails and possibly to RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.


Pssm-ID: 349335  Cd Length: 48  Bit Score: 37.47  E-value: 3.58e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*...
gi 1731019548 2268 QPVEVLAREVKKLRSREIpLVKilWQNHGVEEATWEKEEDMRAQYPEL 2315
Cdd:cd18979      2 FPEKVLDIRQRDKGNKEF-LVQ--WQGLSVEEATWEPYKDLVQQFPDF 46
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
503-643 3.78e-03


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 42.55  E-value: 3.78e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  503 RNRAVAELPAAPGESSKPICAAERPSEQPslarrsrveAASARPQPTRQPAEsrrfdPTQAQPKRQSVAVAErsrgsrTT 582
Cdd:PRK07994   360 HPAAPLPEPEVPPQSAAPAASAQATAAPT---------AAVAPPQAPAVPPP-----PASAPQQAPAVPLPE------TT 419
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1731019548  583 NSRRAARNP-EAEASRAKPSRVSPVASrAASPPVASRAAREIPVPSRAASAQAEPVLSSSSR 643
Cdd:PRK07994   420 SQLLAARQQlQRAQGATKAKKSEPAAA-SRARPVNSALERLASVRPAPSALEKAPAKKEAYR 480
growth_prot_Scy NF041483
polarized growth protein Scy;
505-650 4.22e-03


Pssm-ID: 469371 [Multi-domain]  Cd Length: 1293  Bit Score: 42.51  E-value: 4.22e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  505 RAVAE--LPAAPGEsskpicaAERPSEQPSlaRRSRVEAASARPQPT----RQPAESRRF---DPTQAQpkrQSVAVAER 575
Cdd:NF041483   173 RAEAEqaLAAARAE-------AERLAEEAR--QRLGSEAESARAEAEailrRARKDAERLlnaASTQAQ---EATDHAEQ 240
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1731019548  576 SRGSRTTNSRrAARNPEAEASRAKPSRVSPvasraasppvASRAAREipvpsraASAQAEPVLSSSSRAARTKLA 650
Cdd:NF041483   241 LRSSTAAESD-QARRQAAELSRAAEQRMQE----------AEEALRE-------ARAEAEKVVAEAKEAAAKQLA 297
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
494-636 5.12e-03


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 42.08  E-value: 5.12e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  494 CSAAQPSR-------NRNRAVAEL-PAAPGESSKPICAAERPSEQPSlaRRSRVEAASARPQPTRQPAESRRfdPTQAQP 565
Cdd:PHA03307   295 SPSPSPSSpgsgpapSSPRASSSSsSSRESSSSSTSSSSESSRGAAV--SPGPSPSRSPSPSRPPPPADPSS--PRKRPR 370
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1731019548  566 KRQSVAVAERSRGSRTtnSRRAAR-NPEAEASRAKPSRvSPVASRAASPPVASRAAREIPVPSRAASAQAEP 636
Cdd:PHA03307   371 PSRAPSSPAASAGRPT--RRRARAaVAGRARRRDATGR-FPAGRPRPSPLDAGAASGAFYARYPLLTPSGEP 439
SepH NF040712
septation protein SepH; Septation protein H (SepH) was first characterized in Streptomyces ...
496-627 9.18e-03

septation protein SepH; Septation protein H (SepH) was first characterized in Streptomyces venezuelae, and homologs were identified in Mycobacterium smegmatis. SepH contains an N-terminal DUF3071 domain and a conserved C-terminal region. It binds directly to the cell division protein FtsZ to stimulate the assembly of FtsZ protofilaments.


Pssm-ID: 468676 [Multi-domain]  Cd Length: 346  Bit Score: 40.91  E-value: 9.18e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1731019548  496 AAQPSRNRNRAVAELPAAPGEsskpicAAERPSEQPSLARRSRVEAASARPQPTRQPAESRRFDPTQAQPKRQSVAVAER 575
Cdd:NF040712   206 AREPADARPEEVEPAPAAEGA------PATDSDPAEAGTPDDLASARRRRAGVEQPEDEPVGPGAAPAAEPDEATRDAGE 279
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1731019548  576 SRGSRTTNSRRAARNPE----AEASRAKPSRVSPVASRAASPPVASRAAREIPVPS 627
Cdd:NF040712   280 PPAPGAAETPEAAEPPApapaAPAAPAAPEAEEPARPEPPPAPKPKRRRRRASVPS 335
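
The pairwise alignment blocks above can be summarized numerically. The following is a minimal sketch, not part of the CD-Search output: it counts identical residues over aligned columns for one query/subject row pair. The function name `aligned_identity` is hypothetical, and the sketch assumes that `-` marks a gap and that lowercase letters mark residues outside the aligned block, which in this listing always sit opposite gaps. The example rows are copied from the cd18979 hit.

```python
def aligned_identity(query_row, subject_row):
    """Return (identical, aligned) counts for two equal-length alignment rows.

    Assumption: '-' is a gap character; any column containing a gap is
    skipped, which also skips the lowercase (unaligned) residues that
    appear opposite gaps in this listing.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have equal length")
    aligned = 0
    identical = 0
    for q, s in zip(query_row, subject_row):
        if q == "-" or s == "-":
            continue  # gap column: not part of the aligned block
        aligned += 1
        if q.upper() == s.upper():
            identical += 1
    return identical, aligned


# Rows copied from the cd18979 alignment (query residues 2268-2315).
query = "QPVEVLAREVKKLRSREIpLVKilWQNHGVEEATWEKEEDMRAQYPEL"
subject = "FPEKVLDIRQRDKGNKEF-LVQ--WQGLSVEEATWEPYKDLVQQFPDF"

identical, aligned = aligned_identity(query, subject)
print(f"{identical} identical residues over {aligned} aligned columns")
```

For hits that span several 80-column blocks, the rows of each block would need to be concatenated before calling the function.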
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
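
To work with the hit list programmatically, the per-hit summary lines (`Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...`) can be parsed and filtered against the E-value threshold listed above. This is a minimal sketch that assumes the plain-text layout shown on this page; `parse_summary` and `hits_below` are hypothetical helper names, not part of any NCBI tool, and the strict `<` comparison is an assumption about how the threshold is applied.

```python
import re

# Matches lines such as:
#   Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 44.86  E-value: 6.09e-04
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"          # optional "[Multi-domain]" tag
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Parse one summary line into a dict, or return None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["flag"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


def hits_below(lines, threshold=0.01):
    """Return parsed hits whose E-value is below the threshold (strict, by assumption)."""
    parsed = (parse_summary(line) for line in lines)
    return [hit for hit in parsed if hit is not None and hit["e_value"] < threshold]


if __name__ == "__main__":
    sample = [
        "Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 44.86  E-value: 6.09e-04",
        "Pssm-ID: 133146  Cd Length: 124  Bit Score: 41.00  E-value: 1.07e-03",
    ]
    for hit in hits_below(sample):
        print(hit)
```

Lines that are not summary lines (alignment rows, rulers, descriptions) return `None` and are skipped, so the whole page text can be passed in line by line.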

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D):384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D):265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D):200-3.