Conserved domains on  [gi|1776518020|gb|KAE8655616|]
hypothetical protein F3Y22_tig00117021pilonHSYRG00028 [Hibiscus syriacus]

List of domain hits

Name Accession Description Interval E-value
RNase_HI_RT_Ty1 cd09272
Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ...
1093-1230 1.58e-72

Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea, and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses and in long terminal repeat (LTR)-bearing and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid (DEDD), are invariant across all RNase H domains. Based on phylogenetic patterns, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Ty1/Copia family is widely distributed among the genomes of plants, fungi, and animals. RNase H has been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260004  Cd Length: 140  Bit Score: 237.37  E-value: 1.58e-72
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020 1093 VGYCDADYAGDHDTRRSTTGYVFKLGSGTISWCSKRQPTVSLSTTEAEYRAAAMAAQESTWLIQLMNNLHQPVDYAIPLY 1172
Cdd:cd09272      1 EGYSDADWAGDPDDRRSTSGYVFFLGGGPISWKSKKQTTVALSSTEAEYIALAEAAKEALWLRRLLEELGIPLDGPTTIY 80
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1776518020 1173 CDNQSAIRLAENPVFHARTKHVEVHYHFVREKVLQEEIEMRQIKTDEQIADLFTKSLS 1230
Cdd:cd09272     81 CDNQSAIALAKNPVFHSRTKHIDIRYHFIREKVEKGEIKVEYVPTEDQLADILTKPLP 138
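The wrapped alignment rows above can be stitched back together to check how much of the domain model the hit covers. The following is a minimal sketch, assuming each row has the form "label start sequence end"; the names `ROW`, `parse_alignment`, and `model_coverage` are illustrative and not part of any NCBI tool.

```python
import re

# Alignment rows copied from the cd09272 hit (ruler lines omitted).
ROWS = """\
gi 1776518020 1093 VGYCDADYAGDHDTRRSTTGYVFKLGSGTISWCSKRQPTVSLSTTEAEYRAAAMAAQESTWLIQLMNNLHQPVDYAIPLY 1172
Cdd:cd09272      1 EGYSDADWAGDPDDRRSTSGYVFFLGGGPISWKSKKQTTVALSSTEAEYIALAEAAKEALWLRRLLEELGIPLDGPTTIY 80
gi 1776518020 1173 CDNQSAIRLAENPVFHARTKHVEVHYHFVREKVLQEEIEMRQIKTDEQIADLFTKSLS 1230
Cdd:cd09272     81 CDNQSAIALAKNPVFHSRTKHIDIRYHFIREKVEKGEIKVEYVPTEDQLADILTKPLP 138
"""

# Assumed row layout: label, start position, aligned residues, end position.
ROW = re.compile(r"^(gi \d+|Cdd:\S+)\s+(\d+)\s+(\S+)\s+(\d+)$")


def parse_alignment(text):
    """Merge wrapped rows into {label: [start, sequence, end]}."""
    merged = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # skip rulers, blank lines, anything unrecognized
        label, start, seq, end = m.group(1), int(m.group(2)), m.group(3), int(m.group(4))
        if label not in merged:
            merged[label] = [start, seq, end]
        else:
            merged[label][1] += seq   # append the continuation row
            merged[label][2] = end    # keep the last end position
    return merged


def model_coverage(start, end, cd_length):
    """Fraction of the domain model spanned by the aligned positions."""
    return (end - start + 1) / cd_length


aln = parse_alignment(ROWS)
m_start, _, m_end = aln["Cdd:cd09272"]
print(f"model positions {m_start}-{m_end}, coverage {model_coverage(m_start, m_end, 140):.1%}")
```

With the reported Cd Length of 140 and model positions 1-138, the hit spans nearly the whole model, which is consistent with its low E-value.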
RVT_2 super family cl06662
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
822-1036 3.01e-66

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model.


The actual alignment was detected with superfamily member pfam07727:

Pssm-ID: 400190  Cd Length: 243  Bit Score: 224.01  E-value: 3.01e-66
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020  822 NQTWDIVPKIKDVKPISCKWVYKIKRRpDGSIERYKARLVARGFSQQYGLDYDETFSPVAKLTTVRVLLALAANKDWNLW 901
Cdd:pfam07727    1 NETWTLVKLPKNVKPIGTTWVHTHKIN-DLKEVQYKARLVAQGFRQIAGEDYDKVFSPVIRLSSVRLLLAIAAEYEWPVH 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020  902 QMDVKNAFLHGELDREIYMTQPMGFQSQDHPEYVCKLRKALYGLKQAPRAWYGKIAEFLTKSGYSVTPADSSLFVKANEE 981
Cdd:pfam07727   80 HMDVSSAFLNGDIDEEIYVKQPPGFNIDNESGKVWQLNKSLYGLKQAPYMWNTCITKVLMDLNFEPDTAESGMYCRGFGE 159
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1776518020  982 ---------------------ILQTKENLSVRFQMKELGQLKHFLGLEVDRTHEGIFLCQQKYAKDLLKRFGMLDA 1036
Cdd:pfam07727  160 nklivglyvddmfitgsditiINDFKLELAKHFKMKDLGDISEFLGIEFIQIAGGIRLSQHNYLNSVIKKFNLTNN 235
transpos_IS481 super family cl41329
IS481 family transposase
507-606 2.94e-14

IS481 family transposase


The actual alignment was detected with superfamily member NF033577:

Pssm-ID: 468094 [Multi-domain]  Cd Length: 283  Bit Score: 74.55  E-value: 2.94e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020  507 KFKAKEPLELVHSDVFGpVKQQSISGMRYMVTFIDDFSR------------DSAEG-------EVGKKICCLRTDNGGEY 567
Cdd:NF033577   121 RYERAHPGELWHIDIKK-LGRIPDVGRLYLHTAIDDHSRfayaelypdetaETAADflrrafaEHGIPIRRVLTDNGSEF 199
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|.
gi 1776518020  568 RSN--EFSQYLRECRIRHQYTCANTPQQNGVAERKNRHLAE 606
Cdd:NF033577   200 RSRahGFELALAELGIEHRRTRPYHPQTNGKVERFHRTLKD 240
gag_pre-integrs pfam13976
GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements ...
447-497 3.89e-14

GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements and lies just upstream of the integrase region on the polyproteins.


Pssm-ID: 372857  Cd Length: 67  Bit Score: 68.16  E-value: 3.89e-14
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1776518020  447 RKNETSDLWHMRLGHVSYSKLSVMVKKSMLKGLPQLDvrtDTVCAGCQYGK 497
Cdd:pfam13976   19 SKDDETWLWHRRLGHPSFKGLKKLVKKGLLPGLPISK---DLVCESCQLGK 66
DUF4219 pfam13961
Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus ...
42-67 3.13e-09

Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus of many Gag-pol polyproteins and related proteins. It contains a highly conserved YxxWxxxM sequence motif.


Pssm-ID: 433608  Cd Length: 27  Bit Score: 53.28  E-value: 3.13e-09
                           10        20
                   ....*....|....*....|....*.
gi 1776518020   42 LNNKNYNTWATCMESYLQGQDLWEVV 67
Cdd:pfam13961    1 LDGDNYETWKLRMKLYLQAQDLWEVV 26
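Because this alignment is short and ungapped, it is easy to verify by hand. The sketch below counts identical positions between the query and the pfam13961 row and looks for the YxxWxxxM motif mentioned in the description; the helper names are illustrative only.

```python
import re

QUERY = "LNNKNYNTWATCMESYLQGQDLWEVV"  # gi 1776518020, residues 42-67
MODEL = "LDGDNYETWKLRMKLYLQAQDLWEVV"  # pfam13961, positions 1-26

# YxxWxxxM, where x is any residue.
MOTIF = re.compile(r"Y..W...M")


def identity(a, b):
    """Return (identical positions, alignment length) for two aligned rows."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches, len(a)


matches, length = identity(QUERY, MODEL)
print(f"{matches}/{length} identical positions")  # 16/26
print(MOTIF.search(QUERY).group())                # YNTWATCM
print(MOTIF.search(MODEL).group())                # YETWKLRM
```

Both rows carry the motif at the same alignment columns, even though several of the intervening residues differ.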
zf-CCHC pfam00098
Zinc knuckle; The zinc knuckle is a zinc-binding motif composed of the following ...
252-267 2.96e-05

Zinc knuckle; The zinc knuckle is a zinc-binding motif with the pattern CX2CX4HX4C, where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). The prototype structure is from HIV. The family also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. The structure is an 18-residue zinc finger.


Pssm-ID: 395050 [Multi-domain]  Cd Length: 18  Bit Score: 41.74  E-value: 2.96e-05
                           10
                   ....*....|....*.
gi 1776518020  252 GKCYNCGKMGHMAKDC 267
Cdd:pfam00098    1 GKCYNCGEPGHIARDC 16
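The CX2CX4HX4C pattern from the zinc knuckle description translates directly into a regular expression. This is a small sketch, assuming the query segment and its start residue are taken from the alignment above; `find_zinc_knuckles` is a hypothetical helper, not an NCBI function.

```python
import re

# CX2CX4HX4C: Cys, any 2 residues, Cys, any 4, His, any 4, Cys.
ZINC_KNUCKLE = re.compile(r"C.{2}C.{4}H.{4}C")


def find_zinc_knuckles(segment, first_residue=1):
    """Return (start, end, motif) tuples in protein coordinates.

    first_residue is the residue number of segment[0].
    """
    hits = []
    for m in ZINC_KNUCKLE.finditer(segment):
        start = first_residue + m.start()
        end = first_residue + m.end() - 1
        hits.append((start, end, m.group()))
    return hits


# Query row of the pfam00098 alignment, which starts at residue 252.
print(find_zinc_knuckles("GKCYNCGKMGHMAKDC", first_residue=252))
# [(254, 267, 'CYNCGKMGHMAKDC')]
```

The motif itself spans residues 254-267; the reported interval 252-267 is slightly longer because the model row includes two residues before the first cysteine.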
PTZ00368 super family cl31762
universal minicircle sequence binding protein (UMSBP); Provisional
254-283 6.75e-04

universal minicircle sequence binding protein (UMSBP); Provisional


The actual alignment was detected with superfamily member PTZ00368:

Pssm-ID: 173561 [Multi-domain]  Cd Length: 148  Bit Score: 41.33  E-value: 6.75e-04
                           10        20        30
                   ....*....|....*....|....*....|
gi 1776518020  254 CYNCGKMGHMAKDCWTKKKPVESNTATSCS 283
Cdd:PTZ00368    55 CYNCGKTGHLSRECPEAPPGSGPRSCYNCG 84
PTZ00368 super family cl31762
universal minicircle sequence binding protein (UMSBP); Provisional
196-267 1.57e-03

universal minicircle sequence binding protein (UMSBP); Provisional


The actual alignment was detected with superfamily member PTZ00368:

Pssm-ID: 173561 [Multi-domain]  Cd Length: 148  Bit Score: 40.56  E-value: 1.57e-03
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1776518020  196 KGEEEALYTSKSRGTFQRYTGNGSKKDGDKVKNYQGKGGPHSGGASKNRGNSRKFDGKCYNCGKMGHMAKDC 267
Cdd:PTZ00368    74 GSGPRSCYNCGQTGHISRECPNRAKGGAARRACYNCGGEGHISRDCPNAGKRPGGDKTCYNCGQTGHLSRDC 145
 
Name Accession Description Interval E-value
RNase_HI_RT_Ty1 cd09272
Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ...
1093-1230 1.58e-72

Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea, and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses and in long terminal repeat (LTR)-bearing and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid (DEDD), are invariant across all RNase H domains. Based on phylogenetic patterns, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Ty1/Copia family is widely distributed among the genomes of plants, fungi, and animals. RNase H has been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260004  Cd Length: 140  Bit Score: 237.37  E-value: 1.58e-72
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020 1093 VGYCDADYAGDHDTRRSTTGYVFKLGSGTISWCSKRQPTVSLSTTEAEYRAAAMAAQESTWLIQLMNNLHQPVDYAIPLY 1172
Cdd:cd09272      1 EGYSDADWAGDPDDRRSTSGYVFFLGGGPISWKSKKQTTVALSSTEAEYIALAEAAKEALWLRRLLEELGIPLDGPTTIY 80
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1776518020 1173 CDNQSAIRLAENPVFHARTKHVEVHYHFVREKVLQEEIEMRQIKTDEQIADLFTKSLS 1230
Cdd:cd09272     81 CDNQSAIALAKNPVFHSRTKHIDIRYHFIREKVEKGEIKVEYVPTEDQLADILTKPLP 138
RVT_2 pfam07727
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
822-1036 3.01e-66

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model.


Pssm-ID: 400190  Cd Length: 243  Bit Score: 224.01  E-value: 3.01e-66
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020  822 NQTWDIVPKIKDVKPISCKWVYKIKRRpDGSIERYKARLVARGFSQQYGLDYDETFSPVAKLTTVRVLLALAANKDWNLW 901
Cdd:pfam07727    1 NETWTLVKLPKNVKPIGTTWVHTHKIN-DLKEVQYKARLVAQGFRQIAGEDYDKVFSPVIRLSSVRLLLAIAAEYEWPVH 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020  902 QMDVKNAFLHGELDREIYMTQPMGFQSQDHPEYVCKLRKALYGLKQAPRAWYGKIAEFLTKSGYSVTPADSSLFVKANEE 981
Cdd:pfam07727   80 HMDVSSAFLNGDIDEEIYVKQPPGFNIDNESGKVWQLNKSLYGLKQAPYMWNTCITKVLMDLNFEPDTAESGMYCRGFGE 159
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1776518020  982 ---------------------ILQTKENLSVRFQMKELGQLKHFLGLEVDRTHEGIFLCQQKYAKDLLKRFGMLDA 1036
Cdd:pfam07727  160 nklivglyvddmfitgsditiINDFKLELAKHFKMKDLGDISEFLGIEFIQIAGGIRLSQHNYLNSVIKKFNLTNN 235
transpos_IS481 NF033577
IS481 family transposase
507-606 2.94e-14

IS481 family transposase


Pssm-ID: 468094 [Multi-domain]  Cd Length: 283  Bit Score: 74.55  E-value: 2.94e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020  507 KFKAKEPLELVHSDVFGpVKQQSISGMRYMVTFIDDFSR------------DSAEG-------EVGKKICCLRTDNGGEY 567
Cdd:NF033577   121 RYERAHPGELWHIDIKK-LGRIPDVGRLYLHTAIDDHSRfayaelypdetaETAADflrrafaEHGIPIRRVLTDNGSEF 199
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|.
gi 1776518020  568 RSN--EFSQYLRECRIRHQYTCANTPQQNGVAERKNRHLAE 606
Cdd:NF033577   200 RSRahGFELALAELGIEHRRTRPYHPQTNGKVERFHRTLKD 240
gag_pre-integrs pfam13976
GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements ...
447-497 3.89e-14

GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements and lies just upstream of the integrase region on the polyproteins.


Pssm-ID: 372857  Cd Length: 67  Bit Score: 68.16  E-value: 3.89e-14
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1776518020  447 RKNETSDLWHMRLGHVSYSKLSVMVKKSMLKGLPQLDvrtDTVCAGCQYGK 497
Cdd:pfam13976   19 SKDDETWLWHRRLGHPSFKGLKKLVKKGLLPGLPISK---DLVCESCQLGK 66
Tra5 COG2801
Transposase InsO and inactivated derivatives [Mobilome: prophages, transposons]
508-604 9.01e-10

Transposase InsO and inactivated derivatives [Mobilome: prophages, transposons]


Pssm-ID: 442053 [Multi-domain]  Cd Length: 309  Bit Score: 61.71  E-value: 9.01e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020  508 FKAKEPLELVHSDV-FGPVKQqsisGMRYMVTFIDDFSR--------------------DSA---EGEVGKKIccLRTDN 563
Cdd:COG2801    143 FTATAPNQVWVTDItYIPTAE----GWLYLAAVIDLFSReivgwsvsdsmdaelvvdalEMAierRGPPKPLI--LHSDN 216
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|.
gi 1776518020  564 GGEYRSNEFSQYLRECRIRHQYTCANTPQQNGVAERKNRHL 604
Cdd:COG2801    217 GSQYTSKAYQELLKKLGITQSMSRPGNPQDNAFIESFFGTL 257
DUF4219 pfam13961
Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus ...
42-67 3.13e-09

Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus of many Gag-pol polyproteins and related proteins. It contains a highly conserved YxxWxxxM sequence motif.


Pssm-ID: 433608  Cd Length: 27  Bit Score: 53.28  E-value: 3.13e-09
                           10        20
                   ....*....|....*....|....*.
gi 1776518020   42 LNNKNYNTWATCMESYLQGQDLWEVV 67
Cdd:pfam13961    1 LDGDNYETWKLRMKLYLQAQDLWEVV 26
rve pfam00665
Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into ...
513-591 2.23e-05

Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc-binding domain (pfam02022). This domain is the central catalytic domain. The carboxyl-terminal domain is a non-specific DNA-binding domain (pfam00552). The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.


Pssm-ID: 459897 [Multi-domain]  Cd Length: 98  Bit Score: 44.23  E-value: 2.23e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020  513 PLELVHSDVFgPVKQQSISGMRYMVTFIDDFSR----------DSAE-------------GEVGKKIcclRTDNGGEYRS 569
Cdd:pfam00665    1 PNQLWQGDFT-YIRIPGGGGKLYLLVIVDDFSReilawalsseMDAElvldaleraiafrGGVPLII---HSDNGSEYTS 76
                           90       100
                   ....*....|....*....|..
gi 1776518020  570 NEFSQYLRECRIRHQYTCANTP 591
Cdd:pfam00665   77 KAFREFLKDLGIKPSFSRPGNP 98
zf-CCHC pfam00098
Zinc knuckle; The zinc knuckle is a zinc-binding motif composed of the following ...
252-267 2.96e-05

Zinc knuckle; The zinc knuckle is a zinc-binding motif with the pattern CX2CX4HX4C, where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). The prototype structure is from HIV. The family also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. The structure is an 18-residue zinc finger.


Pssm-ID: 395050 [Multi-domain]  Cd Length: 18  Bit Score: 41.74  E-value: 2.96e-05
                           10
                   ....*....|....*.
gi 1776518020  252 GKCYNCGKMGHMAKDC 267
Cdd:pfam00098    1 GKCYNCGEPGHIARDC 16
transpos_IS3 NF033516
IS3 family transposase
535-615 3.43e-05

IS3 family transposase


Pssm-ID: 468052 [Multi-domain]  Cd Length: 369  Bit Score: 47.56  E-value: 3.43e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020  535 YMVTFIDDFSR---------------------DSAEGEVGKKICCLRTDNGGEYRSNEFSQYLRECRIRHQYTCANTPQQ 593
Cdd:NF033516   234 YLAVVLDLFSReivgwsvstsmsaelvldaleMAIEWRGKPEGLILHSDNGSQYTSKAYREWLKEHGITQSMSRPGNCWD 313
                           90       100
                   ....*....|....*....|....*...
gi 1776518020  594 NGVAERKNRHL------AEICRSMLHAK 615
Cdd:NF033516   314 NAVAESFFGTLkreclyRRRFRTLEEAR 341
ZnF_C2HC smart00343
zinc finger
253-268 5.40e-05

zinc finger


Pssm-ID: 197667 [Multi-domain]  Cd Length: 17  Bit Score: 41.27  E-value: 5.40e-05
                            10
                    ....*....|....*.
gi 1776518020   253 KCYNCGKMGHMAKDCW 268
Cdd:smart00343    1 KCYNCGKEGHIARDCP 16
PTZ00368 PTZ00368
universal minicircle sequence binding protein (UMSBP); Provisional
254-283 6.75e-04

universal minicircle sequence binding protein (UMSBP); Provisional


Pssm-ID: 173561 [Multi-domain]  Cd Length: 148  Bit Score: 41.33  E-value: 6.75e-04
                           10        20        30
                   ....*....|....*....|....*....|
gi 1776518020  254 CYNCGKMGHMAKDCWTKKKPVESNTATSCS 283
Cdd:PTZ00368    55 CYNCGKTGHLSRECPEAPPGSGPRSCYNCG 84
PTZ00368 PTZ00368
universal minicircle sequence binding protein (UMSBP); Provisional
196-267 1.57e-03

universal minicircle sequence binding protein (UMSBP); Provisional


Pssm-ID: 173561 [Multi-domain]  Cd Length: 148  Bit Score: 40.56  E-value: 1.57e-03
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1776518020  196 KGEEEALYTSKSRGTFQRYTGNGSKKDGDKVKNYQGKGGPHSGGASKNRGNSRKFDGKCYNCGKMGHMAKDC 267
Cdd:PTZ00368    74 GSGPRSCYNCGQTGHISRECPNRAKGGAARRACYNCGGEGHISRDCPNAGKRPGGDKTCYNCGQTGHLSRDC 145
PHA02517 PHA02517
putative transposase OrfB; Reviewed
503-648 2.74e-03

putative transposase OrfB; Reviewed


Pssm-ID: 222853 [Multi-domain]  Cd Length: 277  Bit Score: 41.39  E-value: 2.74e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020  503 YDESKFKAKEPLELVHSDVfgpVKQQSISGMRYMVTFIDDFSR-----------DS------------AEGEVGKKIccL 559
Cdd:PHA02517    99 RVNRQFVATRPNQLWVADF---TYVSTWQGWVYVAFIIDVFARrivgwrvsssmDTdfvldaleqalwARGRPGGLI--H 173
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1776518020  560 RTDNGGEYRSNEFSQYLRECRIRHQYTCANTPQQNGVAERKNRHLAEICrsmLHAKNVSGRFWAEAMRT--AAFVINRLP 637
Cdd:PHA02517   174 HSDKGSQYVSLAYTQRLKEAGIRASTGSRGDSYDNAPAESINGLYKAEV---IHRVSWKNREEVELATLewVAWYNNRRL 250
                          170
                   ....*....|.
gi 1776518020  638 QPRLGFVSPFE 648
Cdd:PHA02517   251 HERLGYTPPAE 261
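Many of the hits listed above overlap on the query (for example, several transposase and integrase models all fall between residues 503 and 648). One way to read the list is to keep only the lowest E-value hit in each overlapping group and then order the survivors along the protein. The sketch below does this with a simple greedy rule; it is an illustration under that assumption and is not claimed to reproduce how CD-Search builds its concise view.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    name: str
    start: int
    end: int
    evalue: float


# Name, interval, and E-value copied from the domain hit list.
HITS = [
    Hit("RNase_HI_RT_Ty1", 1093, 1230, 1.58e-72),
    Hit("RVT_2", 822, 1036, 3.01e-66),
    Hit("transpos_IS481", 507, 606, 2.94e-14),
    Hit("gag_pre-integrs", 447, 497, 3.89e-14),
    Hit("Tra5", 508, 604, 9.01e-10),
    Hit("DUF4219", 42, 67, 3.13e-09),
    Hit("rve", 513, 591, 2.23e-05),
    Hit("zf-CCHC", 252, 267, 2.96e-05),
    Hit("transpos_IS3", 535, 615, 3.43e-05),
    Hit("ZnF_C2HC", 253, 268, 5.40e-05),
    Hit("PTZ00368", 254, 283, 6.75e-04),
    Hit("PTZ00368", 196, 267, 1.57e-03),
    Hit("PHA02517", 503, 648, 2.74e-03),
]


def overlaps(a, b):
    """True if two closed intervals share at least one residue."""
    return a.start <= b.end and b.start <= a.end


def best_non_overlapping(hits):
    """Greedily keep the lowest E-value hits, then sort by position."""
    kept = []
    for hit in sorted(hits, key=lambda h: h.evalue):
        if not any(overlaps(hit, k) for k in kept):
            kept.append(hit)
    return sorted(kept, key=lambda h: h.start)


for h in best_non_overlapping(HITS):
    print(f"{h.start:>5}-{h.end:<5} {h.name:<16} {h.evalue:.2e}")
```

Read from N-terminus to C-terminus, the surviving hits are DUF4219, zf-CCHC, gag_pre-integrs, transpos_IS481, RVT_2, and RNase_HI_RT_Ty1.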
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
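The E-value threshold of 0.01 explains why the weakest hit in the list still has an E-value in the 1e-03 range. As a small sketch, assuming each summary line has the form "start-end E-value" as it appears under every hit name, the threshold check could look like this (`parse_hit_line` and `passes_threshold` are hypothetical helpers):

```python
import re

E_VALUE_THRESHOLD = 0.01  # from the preset options

# Assumed layout of a hit summary line: "start-end E-value".
HIT_LINE = re.compile(r"^(\d+)-(\d+)\s+(\S+)$")


def parse_hit_line(line):
    """Parse 'start-end evalue' into (start, end, evalue)."""
    m = HIT_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized hit line: {line!r}")
    return int(m.group(1)), int(m.group(2)), float(m.group(3))


def passes_threshold(evalue, threshold=E_VALUE_THRESHOLD):
    """A hit is reported when its E-value does not exceed the threshold."""
    return evalue <= threshold


for line in ["1093-1230 1.58e-72", "503-648 2.74e-03"]:
    start, end, evalue = parse_hit_line(line)
    print(start, end, evalue, passes_threshold(evalue))
```

Both the strongest and the weakest hit from the list pass; a hit with an E-value of 0.05, for instance, would not be reported under this preset.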
