Conserved domains on  [gi|32487722|emb|CAE05399|]

OSJNBa0022F16.23 [Oryza sativa Japonica Group]

Protein Classification

List of domain hits

Name Accession Description Interval E-value
RNase_HI_RT_Ty1 cd09272
Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ...
1181-1319 5.61e-67

Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms including bacteria, archaea, and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses and in long terminal repeat (LTR)-bearing and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Phylogenetically, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Ty1/Copia family is widely distributed among the genomes of plants, fungi, and animals. RNase H has been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.



Pssm-ID: 260004  Cd Length: 140  Bit Score: 222.34  E-value: 5.61e-67
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722 1181 VGYCDADMAGNVDTRKSTSGVVFFLGANPVSWQSIKQKVVALSSCEAEYIVAMTAACQGIWLAQLLREIEQEEPQSFKLL 1260
Cdd:cd09272    1 EGYSDADWAGDPDDRRSTSGYVFFLGGGPISWKSKKQTTVALSSTEAEYIALAEAAKEALWLRRLLEELGIPLDGPTTIY 80
                         90       100       110       120       130
                 ....*....|....*....|....*....|....*....|....*....|....*....
gi 32487722 1261 VDNKSAITLSKNPVFHDRSKHIATCYHFIHECVEDGRAQVEFIGTDGKVADILTKALGR 1319
Cdd:cd09272   81 CDNQSAIALAKNPVFHSRTKHIDIRYHFIREKVEKGEIKVEYVPTEDQLADILTKPLPR 139
RVT_2 super family cl06662
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
971-1096 2.46e-37

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model.


The actual alignment was detected with superfamily member pfam07727:

Pssm-ID: 400190  Cd Length: 243  Bit Score: 141.19  E-value: 2.46e-37
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722    971 GLRQATsrarHAWNAKLDTSLLSLGFRRNDCEHTVYGRGSGDSRLLVGVYVDDLIITGNVVMEIDRFKAEMMSLFKMSDL 1050
Cdd:pfam07727  122 GLKQAP----YMWNTCITKVLMDLNFEPDTAESGMYCRGFGENKLIVGLYVDDMFITGSDITIINDFKLELAKHFKMKDL 197
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*.
gi 32487722   1051 GPLSFYLGIEVEQVADGVRLLQKLYAQCILERAGMQGCNSCSTPME 1096
Cdd:pfam07727  198 GDISEFLGIEFIQIAGGIRLSQHNYLNSVIKKFNLTNNNGKYTPII 243
ps-ssRNAv_RdRp-like super family cl40470
conserved catalytic core domain of RNA-dependent RNA polymerase (RdRp) from the positive-sense ...
1534-1627 9.22e-19

conserved catalytic core domain of RNA-dependent RNA polymerase (RdRp) from the positive-sense single-stranded RNA [(+)ssRNA] viruses and closely related viruses; This family contains the catalytic core domain of RdRp of RNA viruses that belong to Group IV of the Baltimore classification system, a group of related viruses that have positive-sense (+), single-stranded (ss) genomes made of ribonucleic acid (RNA). RdRp (also known as RNA replicase) catalyzes the replication of RNA from an RNA template; specifically, it catalyzes the synthesis of the RNA strand complementary to a given RNA template. The Baltimore classification is divided into 7 groups, 3 of which include RNA viruses: Group IV (+) RNA viruses, Group III double-stranded (ds) RNA viruses, and Group V negative-sense (-) RNA viruses. Baltimore groups of viruses differ with respect to the nature of their genome (i.e., the nucleic acid form that is packaged into virions) and correspond to distinct strategies of genome replication and expression. (+) viral RNA is similar to mRNA and thus can be immediately translated by the host cell. (+)ssRNA viruses can also produce (+) copies of the genome from (-) strands of an intermediate dsRNA genome. This acts as both a transcription and a replication process since the replicated RNA is also mRNA. RdRps belong to the expansive class of polymerases containing so-called palm catalytic domains along with the accessory fingers and thumb domains. All RdRps also have six conserved structural motifs (A-F); most of them (motifs A-E) are located in the palm subdomain, while motif F is located on the fingers subdomain. All of these motifs have been implicated in RdRp fidelity, for example in the correct incorporation and reorganization of nucleotides. 
In addition to Group IV viruses, this model also includes Picobirnaviruses (PBVs), members of the family Picobirnaviridae of dsRNA viruses (Baltimore classification Group III), which are bi-segmented dsRNA viruses. The phylogenetic tree of the RdRps of RNA viruses (realm Riboviria) showed that picobirnaviruses are embedded in the branch of diverse (+)RNA viruses; sometimes they are collectively referred to as the picornavirus supergroup. RdRps of members of the family Permutatetraviridae, a distinct group of RNA viruses that encompass a circular permutation within the RdRp palm domain, are not included in this model.


The actual alignment was detected with superfamily member cd01650:

Pssm-ID: 477363 [Multi-domain]  Cd Length: 220  Bit Score: 86.96  E-value: 9.22e-19
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722 1534 AFICLL-KKEDASGAEHYHPISLIHSFSKTISKLMANRLAPRLCELVSPNQSAFIRKRDIRDNFLYVQNMVQILHRTKKQ 1612
Cdd:cd01650    2 ARIILIpKKGKPSDPKNYRPISLLSVLYKLLEKILANRLRPVLEENILPNQFGFRPGRSTTDAILLLREVIEKAKEKKKS 81
                         90
                 ....*....|....*
gi 32487722 1613 SLFIKVDIAKAFDTV 1627
Cdd:cd01650   82 LVLVFLDFEKAFDSV 96
gag_pre-integrs pfam13976
GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements ...
448-515 4.88e-17

GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements and lies just upstream of the integrase region on the polyproteins.



Pssm-ID: 372857  Cd Length: 67  Bit Score: 76.64  E-value: 4.88e-17
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 32487722    448 RLYLIKI-TVARPVCLSARTQEAAWLWHARFGHLHFDALRRLTQQDMVRGLPQIDqvEQLCDCCVISKQ 515
Cdd:pfam13976    1 GLYLLDLsSVANSSIAVASKDDETWLWHRRLGHPSFKGLKKLVKKGLLPGLPISK--DLVCESCQLGKQ 67
transpos_IS481 super family cl41329
IS481 family transposase
533-645 4.88e-11

IS481 family transposase


The actual alignment was detected with superfamily member NF033577:

Pssm-ID: 468094 [Multi-domain]  Cd Length: 283  Bit Score: 65.30  E-value: 4.88e-11
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722   533 ELIHGDLCGpiSPATPGGKKHFLL-LVDDASCYMWLTLLQNKGEAAAAikHFQARSEAESGCHLKLLHPDNGGEFTS--A 609
Cdd:NF033577  129 ELWHIDIKK--LGRIPDVGRLYLHtAIDDHSRFAYAELYPDETAETAA--DFLRRAFAEHGIPIRRVLTDNGSEFRSraH 204
                          90       100       110
                  ....*....|....*....|....*....|....*.
gi 32487722   610 EFASYCTESGVKRQLTPPYSPQQNGVVERRNQTILA 645
Cdd:NF033577  205 GFELALAELGIEHRRTRPYHPQTNGKVERFHRTLKD 240
Retrotran_gag_2 super family cl26047
gag-polypeptide of LTR copia-type; This family is found in plants and fungi, and contains ...
99-181 2.91e-07

gag-polypeptide of LTR copia-type; This family is found in plants and fungi, and contains LTR polyproteins, or retrotransposons, of the copia type.


The actual alignment was detected with superfamily member pfam14223:

Pssm-ID: 464108  Cd Length: 130  Bit Score: 51.09  E-value: 2.91e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722     99 VPADLLPVLSVKETAKDAWDAIKTMRVGVDRVRKSkaqELRKQFNAIEFKEGESVEEFSVRLSGLVNNLAVLGIQLEESK 178
Cdd:pfam14223    8 LSDSLLRLVRNADTAKEAWDKLESTYERKSPANKL---TLRRQLHSLKMKEGESVLEHINKFEELVNKLSALGVEISDED 84

                   ...
gi 32487722    179 KLT 181
Cdd:pfam14223   85 LVV 87
DUF4219 pfam13961
Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus ...
53-78 1.48e-06

Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus of many Gag-Pol polyproteins and related proteins. It contains a highly conserved YxxWxxxM sequence motif.



Pssm-ID: 433608  Cd Length: 27  Bit Score: 45.96  E-value: 1.48e-06
                           10        20
                   ....*....|....*....|....*.
gi 32487722     53 LTKTNYSDWSLLMRAMLQVRGLWEAV 78
Cdd:pfam13961    1 LDGDNYETWKLRMKLYLQAQDLWEVV 26
zf-CCHC pfam00098
Zinc knuckle; The zinc knuckle is a zinc-binding motif composed of the following ...
260-275 5.47e-03

Zinc knuckle; The zinc knuckle is a zinc-binding motif composed of the following pattern: CX2CX4HX4C, where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). The prototype structure is from HIV. The family also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. The structure is an 18-residue zinc finger.



Pssm-ID: 395050 [Multi-domain]  Cd Length: 18  Bit Score: 35.58  E-value: 5.47e-03
                           10
                   ....*....|....*.
gi 32487722    260 DKCKNCGRLGHWAKDC 275
Cdd:pfam00098    1 GKCYNCGEPGHIARDC 16
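
The two sequence motifs quoted in the descriptions above (CX2CX4HX4C for zf-CCHC and YxxWxxxM for DUF4219) can be checked directly against the aligned query segments. The following is a minimal sketch, not part of the CD-Search output: the regular expressions are a literal translation of the motif strings, and the helper name `find_motif` is hypothetical.

```python
import re

# Motif patterns quoted in the domain descriptions, translated to regex.
# "X"/"x" in the descriptions means "any amino acid", written "." here.
ZINC_KNUCKLE = re.compile(r"C.{2}C.{4}H.{4}C")  # CX2CX4HX4C (zf-CCHC)
DUF4219_MOTIF = re.compile(r"Y.{2}W.{3}M")      # YxxWxxxM (DUF4219)


def find_motif(pattern, segment, start):
    """Return (first_residue, last_residue, matched_text) for each match.

    `start` is the 1-based residue number of segment[0] in the full
    protein, taken from the left-hand coordinate of the alignment row.
    """
    return [
        (start + m.start(), start + m.end() - 1, m.group())
        for m in pattern.finditer(segment)
    ]


# Query segments copied from the alignment rows for gi 32487722.
zf_hits = find_motif(ZINC_KNUCKLE, "DKCKNCGRLGHWAKDC", 260)
duf_hits = find_motif(DUF4219_MOTIF, "LTKTNYSDWSLLMRAMLQVRGLWEAV", 53)

print(zf_hits)
print(duf_hits)
```

Under these assumptions, each segment contains exactly one match, and the reported coordinates fall inside the intervals listed for the two hits (260-275 and 53-78).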
 
Name Accession Description Interval E-value
RNase_HI_RT_Ty1 cd09272
Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ...
1181-1319 5.61e-67

Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms including bacteria, archaea, and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses and in long terminal repeat (LTR)-bearing and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Phylogenetically, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Ty1/Copia family is widely distributed among the genomes of plants, fungi, and animals. RNase H has been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260004  Cd Length: 140  Bit Score: 222.34  E-value: 5.61e-67
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722 1181 VGYCDADMAGNVDTRKSTSGVVFFLGANPVSWQSIKQKVVALSSCEAEYIVAMTAACQGIWLAQLLREIEQEEPQSFKLL 1260
Cdd:cd09272    1 EGYSDADWAGDPDDRRSTSGYVFFLGGGPISWKSKKQTTVALSSTEAEYIALAEAAKEALWLRRLLEELGIPLDGPTTIY 80
                         90       100       110       120       130
                 ....*....|....*....|....*....|....*....|....*....|....*....
gi 32487722 1261 VDNKSAITLSKNPVFHDRSKHIATCYHFIHECVEDGRAQVEFIGTDGKVADILTKALGR 1319
Cdd:cd09272   81 CDNQSAIALAKNPVFHSRTKHIDIRYHFIREKVEKGEIKVEYVPTEDQLADILTKPLPR 139
RVT_2 pfam07727
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
971-1096 2.46e-37

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model.


Pssm-ID: 400190  Cd Length: 243  Bit Score: 141.19  E-value: 2.46e-37
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722    971 GLRQATsrarHAWNAKLDTSLLSLGFRRNDCEHTVYGRGSGDSRLLVGVYVDDLIITGNVVMEIDRFKAEMMSLFKMSDL 1050
Cdd:pfam07727  122 GLKQAP----YMWNTCITKVLMDLNFEPDTAESGMYCRGFGENKLIVGLYVDDMFITGSDITIINDFKLELAKHFKMKDL 197
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*.
gi 32487722   1051 GPLSFYLGIEVEQVADGVRLLQKLYAQCILERAGMQGCNSCSTPME 1096
Cdd:pfam07727  198 GDISEFLGIEFIQIAGGIRLSQHNYLNSVIKKFNLTNNNGKYTPII 243
RT_nLTR_like cd01650
RT_nLTR: Non-LTR (long terminal repeat) retrotransposon and non-LTR retrovirus reverse ...
1534-1627 9.22e-19

RT_nLTR: Non-LTR (long terminal repeat) retrotransposon and non-LTR retrovirus reverse transcriptase (RT). This subfamily contains both non-LTR retrotransposons and non-LTR retrovirus RTs. RTs catalyze the conversion of single-stranded RNA into double-stranded DNA for integration into host chromosomes. RT is a multifunctional enzyme with RNA-directed DNA polymerase, DNA-directed DNA polymerase, and ribonuclease hybrid (RNase H) activities.


Pssm-ID: 238827 [Multi-domain]  Cd Length: 220  Bit Score: 86.96  E-value: 9.22e-19
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722 1534 AFICLL-KKEDASGAEHYHPISLIHSFSKTISKLMANRLAPRLCELVSPNQSAFIRKRDIRDNFLYVQNMVQILHRTKKQ 1612
Cdd:cd01650    2 ARIILIpKKGKPSDPKNYRPISLLSVLYKLLEKILANRLRPVLEENILPNQFGFRPGRSTTDAILLLREVIEKAKEKKKS 81
                         90
                 ....*....|....*
gi 32487722 1613 SLFIKVDIAKAFDTV 1627
Cdd:cd01650   82 LVLVFLDFEKAFDSV 96
gag_pre-integrs pfam13976
GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements ...
448-515 4.88e-17

GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements and lies just upstream of the integrase region on the polyproteins.


Pssm-ID: 372857  Cd Length: 67  Bit Score: 76.64  E-value: 4.88e-17
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 32487722    448 RLYLIKI-TVARPVCLSARTQEAAWLWHARFGHLHFDALRRLTQQDMVRGLPQIDqvEQLCDCCVISKQ 515
Cdd:pfam13976    1 GLYLLDLsSVANSSIAVASKDDETWLWHRRLGHPSFKGLKKLVKKGLLPGLPISK--DLVCESCQLGKQ 67
transpos_IS481 NF033577
IS481 family transposase
533-645 4.88e-11

IS481 family transposase


Pssm-ID: 468094 [Multi-domain]  Cd Length: 283  Bit Score: 65.30  E-value: 4.88e-11
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722   533 ELIHGDLCGpiSPATPGGKKHFLL-LVDDASCYMWLTLLQNKGEAAAAikHFQARSEAESGCHLKLLHPDNGGEFTS--A 609
Cdd:NF033577  129 ELWHIDIKK--LGRIPDVGRLYLHtAIDDHSRFAYAELYPDETAETAA--DFLRRAFAEHGIPIRRVLTDNGSEFRSraH 204
                          90       100       110
                  ....*....|....*....|....*....|....*.
gi 32487722   610 EFASYCTESGVKRQLTPPYSPQQNGVVERRNQTILA 645
Cdd:NF033577  205 GFELALAELGIEHRRTRPYHPQTNGKVERFHRTLKD 240
rve pfam00665
Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into ...
533-630 2.13e-10

Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc-binding domain (pfam02022). This domain is the central catalytic domain. The carboxyl-terminal domain is a non-specific DNA-binding domain (pfam00552). The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.


Pssm-ID: 459897 [Multi-domain]  Cd Length: 98  Bit Score: 58.86  E-value: 2.13e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722    533 ELIHGDLCgPISPATPGGKKHFLLLVDDASCYMWLTLLQNKGEAAAAIKHFQARSEAESGCHLKLlHPDNGGEFTSAEFA 612
Cdd:pfam00665    3 QLWQGDFT-YIRIPGGGGKLYLLVIVDDFSREILAWALSSEMDAELVLDALERAIAFRGGVPLII-HSDNGSEYTSKAFR 80
                           90
                   ....*....|....*...
gi 32487722    613 SYCTESGVKRQLTPPYSP 630
Cdd:pfam00665   81 EFLKDLGIKPSFSRPGNP 98
Tra8 COG2826
Transposase and inactivated derivatives, IS30 family [Mobilome: prophages, transposons]
513-681 4.31e-08

Transposase and inactivated derivatives, IS30 family [Mobilome: prophages, transposons]


Pssm-ID: 442074 [Multi-domain]  Cd Length: 325  Bit Score: 56.81  E-value: 4.31e-08
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722  513 SKQRRSPFPGQSLFRE-----EDRLEL--IHGDLcgpISPAtpGGKKHFLLLVDDASCYMWLTLLQNKGeaAAAIKHFQA 585
Cdd:COG2826  146 TRKRRGKIPDRRSISErpaevEDRAEPghWEGDL---IIGK--RGKSALLTLVERKSRFVILLKLPDKT--AESVADALI 218
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722  586 RSEAESGCHLKL-LHPDNGGEFtsAEFASYCTESGVKRQLTPPYSPQQNGVVERRNQTI---LAMAQCLlraKSVPAcyw 661
Cdd:COG2826  219 RLLRKLPAFLRKsITTDNGKEF--ADHKEIEAALGIKVYFADPYSPWQRGTNENTNGLLrqyFPKGTDF---STVTQ--- 290
                        170       180
                 ....*....|....*....|
gi 32487722  662 gEAVMTVVFLLNHAPTKCLD 681
Cdd:COG2826  291 -EELDAIADRLNNRPRKCLG 309
Retrotran_gag_2 pfam14223
gag-polypeptide of LTR copia-type; This family is found in plants and fungi, and contains ...
99-181 2.91e-07

gag-polypeptide of LTR copia-type; This family is found in plants and fungi, and contains LTR polyproteins, or retrotransposons, of the copia type.


Pssm-ID: 464108  Cd Length: 130  Bit Score: 51.09  E-value: 2.91e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722     99 VPADLLPVLSVKETAKDAWDAIKTMRVGVDRVRKSkaqELRKQFNAIEFKEGESVEEFSVRLSGLVNNLAVLGIQLEESK 178
Cdd:pfam14223    8 LSDSLLRLVRNADTAKEAWDKLESTYERKSPANKL---TLRRQLHSLKMKEGESVLEHINKFEELVNKLSALGVEISDED 84

                   ...
gi 32487722    179 KLT 181
Cdd:pfam14223   85 LVV 87
DUF4219 pfam13961
Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus ...
53-78 1.48e-06

Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus of many Gag-Pol polyproteins and related proteins. It contains a highly conserved YxxWxxxM sequence motif.


Pssm-ID: 433608  Cd Length: 27  Bit Score: 45.96  E-value: 1.48e-06
                           10        20
                   ....*....|....*....|....*.
gi 32487722     53 LTKTNYSDWSLLMRAMLQVRGLWEAV 78
Cdd:pfam13961    1 LDGDNYETWKLRMKLYLQAQDLWEVV 26
transpos_IS3 NF033516
IS3 family transposase
597-642 3.31e-06

IS3 family transposase


Pssm-ID: 468052 [Multi-domain]  Cd Length: 369  Bit Score: 51.41  E-value: 3.31e-06
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....*.
gi 32487722   597 LLHPDNGGEFTSAEFASYCTESGVKRQLTPPYSPQQNGVVERRNQT 642
Cdd:NF033516  278 ILHSDNGSQYTSKAYREWLKEHGITQSMSRPGNCWDNAVAESFFGT 323
RVT_1 pfam00078
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
1539-1627 7.37e-06

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses.


Pssm-ID: 395031 [Multi-domain]  Cd Length: 189  Bit Score: 48.45  E-value: 7.37e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722   1539 LKKEDASGaehYHPISLIHSFSKTISKLMANRLAPRlcELVSPNQSAFIRkrdirdnflyvqnmvqILHRTKKQSLFIKV 1618
Cdd:pfam00078    1 IPKKGKGK---YRPISLLSIDYKALNKIIVKRLKPE--NLDSPPQPGFRP----------------GLAKLKKAKWFLKL 59

                   ....*....
gi 32487722   1619 DIAKAFDTV 1627
Cdd:pfam00078   60 DLKKAFDQV 68
zf-CCHC pfam00098
Zinc knuckle; The zinc knuckle is a zinc-binding motif composed of the following ...
260-275 5.47e-03

Zinc knuckle; The zinc knuckle is a zinc-binding motif composed of the following pattern: CX2CX4HX4C, where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). The prototype structure is from HIV. The family also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. The structure is an 18-residue zinc finger.


Pssm-ID: 395050 [Multi-domain]  Cd Length: 18  Bit Score: 35.58  E-value: 5.47e-03
                           10
                   ....*....|....*.
gi 32487722    260 DKCKNCGRLGHWAKDC 275
Cdd:pfam00098    1 GKCYNCGEPGHIARDC 16
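
Several hits in the list above cover overlapping stretches of the query (for example, the transposase and integrase models around residues 513-681). A minimal sketch of how such hits could be grouped by their Interval column follows. The names and coordinates are copied from the table; the function `cluster_overlapping` is a hypothetical helper, not something CD-Search provides.

```python
# (name, first_residue, last_residue) copied from the Interval column above.
HITS = [
    ("RNase_HI_RT_Ty1", 1181, 1319),
    ("RVT_2", 971, 1096),
    ("RT_nLTR_like", 1534, 1627),
    ("gag_pre-integrs", 448, 515),
    ("transpos_IS481", 533, 645),
    ("rve", 533, 630),
    ("Tra8", 513, 681),
    ("Retrotran_gag_2", 99, 181),
    ("DUF4219", 53, 78),
    ("transpos_IS3", 597, 642),
    ("RVT_1", 1539, 1627),
    ("zf-CCHC", 260, 275),
]


def cluster_overlapping(hits):
    """Group hits whose residue intervals overlap, scanning left to right."""
    clusters = []
    for name, start, end in sorted(hits, key=lambda h: (h[1], h[2])):
        if clusters and start <= clusters[-1]["end"]:
            # Hit starts inside the current cluster: extend it.
            clusters[-1]["names"].append(name)
            clusters[-1]["end"] = max(clusters[-1]["end"], end)
        else:
            # Hit starts past the current cluster: open a new one.
            clusters.append({"start": start, "end": end, "names": [name]})
    return clusters


clusters = cluster_overlapping(HITS)
for c in clusters:
    print(f'{c["start"]}-{c["end"]}: {", ".join(c["names"])}')
```

Grouping by interval only shows which models compete for the same region of the protein; deciding which hit best describes a region would still rely on the E-values and bit scores reported above.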
 
Name Accession Description Interval E-value
RNase_HI_RT_Ty1 cd09272
Ty1/Copia family of RNase HI in long-term repeat retroelements; Ribonuclease H (RNase H) ...
1181-1319 5.61e-67

Ty1/Copia family of RNase HI in long-term repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms including bacteria, archaea, and eukaryotes. RNase HI has also been observed as adjunct domains to the reverse transcriptase gene in retroviruses, in long-term repeat (LTR)-bearing and non-LTR retrotransposons. RNase HI in LTR retrotransposons perform degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartatic acids and one glutamic acid residue (DEDD) are unvaried across all RNase H domains. Phylogenetic patterns of RNase HI of LTR retroelements is classified into five major families, Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Ty1/Copia family is widely distributed among the genomes of plants, fungi, and animals. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260004  Cd Length: 140  Bit Score: 222.34  E-value: 5.61e-67
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722 1181 VGYCDADMAGNVDTRKSTSGVVFFLGANPVSWQSIKQKVVALSSCEAEYIVAMTAACQGIWLAQLLREIEQEEPQSFKLL 1260
Cdd:cd09272    1 EGYSDADWAGDPDDRRSTSGYVFFLGGGPISWKSKKQTTVALSSTEAEYIALAEAAKEALWLRRLLEELGIPLDGPTTIY 80
                         90       100       110       120       130
                 ....*....|....*....|....*....|....*....|....*....|....*....
gi 32487722 1261 VDNKSAITLSKNPVFHDRSKHIATCYHFIHECVEDGRAQVEFIGTDGKVADILTKALGR 1319
Cdd:cd09272   81 CDNQSAIALAKNPVFHSRTKHIDIRYHFIREKVEKGEIKVEYVPTEDQLADILTKPLPR 139
RVT_2 pfam07727
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
971-1096 2.46e-37

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model.


Pssm-ID: 400190  Cd Length: 243  Bit Score: 141.19  E-value: 2.46e-37
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722    971 GLRQATsrarHAWNAKLDTSLLSLGFRRNDCEHTVYGRGSGDSRLLVGVYVDDLIITGNVVMEIDRFKAEMMSLFKMSDL 1050
Cdd:pfam07727  122 GLKQAP----YMWNTCITKVLMDLNFEPDTAESGMYCRGFGENKLIVGLYVDDMFITGSDITIINDFKLELAKHFKMKDL 197
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*.
gi 32487722   1051 GPLSFYLGIEVEQVADGVRLLQKLYAQCILERAGMQGCNSCSTPME 1096
Cdd:pfam07727  198 GDISEFLGIEFIQIAGGIRLSQHNYLNSVIKKFNLTNNNGKYTPII 243
RT_nLTR_like cd01650
RT_nLTR: Non-LTR (long terminal repeat) retrotransposon and non-LTR retrovirus reverse ...
1534-1627 9.22e-19

RT_nLTR: Non-LTR (long terminal repeat) retrotransposon and non-LTR retrovirus reverse transcriptase (RT). This subfamily contains both non-LTR retrotransposons and non-LTR retrovirus RTs. RTs catalyze the conversion of single-stranded RNA into double-stranded DNA for integration into host chromosomes. RT is a multifunctional enzyme with RNA-directed DNA polymerase, DNA directed DNA polymerase and ribonuclease hybrid (RNase H) activities.


Pssm-ID: 238827 [Multi-domain]  Cd Length: 220  Bit Score: 86.96  E-value: 9.22e-19
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722 1534 AFICLL-KKEDASGAEHYHPISLIHSFSKTISKLMANRLAPRLCELVSPNQSAFIRKRDIRDNFLYVQNMVQILHRTKKQ 1612
Cdd:cd01650    2 ARIILIpKKGKPSDPKNYRPISLLSVLYKLLEKILANRLRPVLEENILPNQFGFRPGRSTTDAILLLREVIEKAKEKKKS 81
                         90
                 ....*....|....*
gi 32487722 1613 SLFIKVDIAKAFDTV 1627
Cdd:cd01650   82 LVLVFLDFEKAFDSV 96
gag_pre-integrs pfam13976
GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements ...
448-515 4.88e-17

GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements and lies just upstream of the integrase region on the polyproteins.


Pssm-ID: 372857  Cd Length: 67  Bit Score: 76.64  E-value: 4.88e-17
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 32487722    448 RLYLIKI-TVARPVCLSARTQEAAWLWHARFGHLHFDALRRLTQQDMVRGLPQIDqvEQLCDCCVISKQ 515
Cdd:pfam13976    1 GLYLLDLsSVANSSIAVASKDDETWLWHRRLGHPSFKGLKKLVKKGLLPGLPISK--DLVCESCQLGKQ 67
transpos_IS481 NF033577
IS481 family transposase; null
533-645 4.88e-11

IS481 family transposase; null


Pssm-ID: 468094 [Multi-domain]  Cd Length: 283  Bit Score: 65.30  E-value: 4.88e-11
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722   533 ELIHGDLCGpiSPATPGGKKHFLL-LVDDASCYMWLTLLQNKGEAAAAikHFQARSEAESGCHLKLLHPDNGGEFTS--A 609
Cdd:NF033577  129 ELWHIDIKK--LGRIPDVGRLYLHtAIDDHSRFAYAELYPDETAETAA--DFLRRAFAEHGIPIRRVLTDNGSEFRSraH 204
                          90       100       110
                  ....*....|....*....|....*....|....*.
gi 32487722   610 EFASYCTESGVKRQLTPPYSPQQNGVVERRNQTILA 645
Cdd:NF033577  205 GFELALAELGIEHRRTRPYHPQTNGKVERFHRTLKD 240
rve pfam00665
Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into ...
533-630 2.13e-10

Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc binding domain pfam02022. This domain is the central catalytic domain. The carboxyl terminal domain that is a non-specific DNA binding domain pfam00552. The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.


Pssm-ID: 459897 [Multi-domain]  Cd Length: 98  Bit Score: 58.86  E-value: 2.13e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722    533 ELIHGDLCgPISPATPGGKKHFLLLVDDASCYMWLTLLQNKGEAAAAIKHFQARSEAESGCHLKLlHPDNGGEFTSAEFA 612
Cdd:pfam00665    3 QLWQGDFT-YIRIPGGGGKLYLLVIVDDFSREILAWALSSEMDAELVLDALERAIAFRGGVPLII-HSDNGSEYTSKAFR 80
                           90
                   ....*....|....*...
gi 32487722    613 SYCTESGVKRQLTPPYSP 630
Cdd:pfam00665   81 EFLKDLGIKPSFSRPGNP 98
Tra8 COG2826
Transposase and inactivated derivatives, IS30 family [Mobilome: prophages, transposons];
513-681 4.31e-08

Transposase and inactivated derivatives, IS30 family [Mobilome: prophages, transposons];


Pssm-ID: 442074 [Multi-domain]  Cd Length: 325  Bit Score: 56.81  E-value: 4.31e-08
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722  513 SKQRRSPFPGQSLFRE-----EDRLEL--IHGDLcgpISPAtpGGKKHFLLLVDDASCYMWLTLLQNKGeaAAAIKHFQA 585
Cdd:COG2826  146 TRKRRGKIPDRRSISErpaevEDRAEPghWEGDL---IIGK--RGKSALLTLVERKSRFVILLKLPDKT--AESVADALI 218
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722  586 RSEAESGCHLKL-LHPDNGGEFtsAEFASYCTESGVKRQLTPPYSPQQNGVVERRNQTI---LAMAQCLlraKSVPAcyw 661
Cdd:COG2826  219 RLLRKLPAFLRKsITTDNGKEF--ADHKEIEAALGIKVYFADPYSPWQRGTNENTNGLLrqyFPKGTDF---STVTQ--- 290
                        170       180
                 ....*....|....*....|
gi 32487722  662 gEAVMTVVFLLNHAPTKCLD 681
Cdd:COG2826  291 -EELDAIADRLNNRPRKCLG 309
Retrotran_gag_2 pfam14223
gag-polypeptide of LTR copia-type; This family is found in plants and fungi, and contains ...
99-181 2.91e-07

gag-polypeptide of LTR copia-type; This family is found in plants and fungi, and contains LTR polyproteins, or retrotransposons of the copia type.


Pssm-ID: 464108  Cd Length: 130  Bit Score: 51.09  E-value: 2.91e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722     99 VPADLLPVLSVKETAKDAWDAIKTMRVGVDRVRKSkaqELRKQFNAIEFKEGESVEEFSVRLSGLVNNLAVLGIQLEESK 178
Cdd:pfam14223    8 LSDSLLRLVRNADTAKEAWDKLESTYERKSPANKL---TLRRQLHSLKMKEGESVLEHINKFEELVNKLSALGVEISDED 84

                   ...
gi 32487722    179 KLT 181
Cdd:pfam14223   85 LVV 87
Tra5 COG2801
Transposase InsO and inactivated derivatives [Mobilome: prophages, transposons];
597-643 2.92e-07

Transposase InsO and inactivated derivatives [Mobilome: prophages, transposons];


Pssm-ID: 442053 [Multi-domain]  Cd Length: 309  Bit Score: 54.39  E-value: 2.92e-07
                         10        20        30        40
                 ....*....|....*....|....*....|....*....|....*..
gi 32487722  597 LLHPDNGGEFTSAEFASYCTESGVKRQLTPPYSPQQNGVVERRNQTI 643
Cdd:COG2801  211 ILHSDNGSQYTSKAYQELLKKLGITQSMSRPGNPQDNAFIESFFGTL 257
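The two rows of a hit such as the Tra5 alignment above can be compared column by column to get a rough identity figure. The following is a minimal sketch, not part of the CD-Search output: the helper name `count_identities` is hypothetical, and it assumes that gaps (`-`) and lowercase (unaligned) residues should be excluded from the count.

```python
def count_identities(query_row: str, subject_row: str) -> tuple[int, int]:
    """Return (identical, aligned) column counts for two equal-length alignment rows."""
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have the same length")
    identical = aligned = 0
    for q, s in zip(query_row, subject_row):
        # Assumption: gap columns and lowercase (unaligned) residues are skipped.
        if not (q.isalpha() and s.isalpha() and q.isupper() and s.isupper()):
            continue
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Rows copied from the Tra5 (COG2801) alignment above.
query = "LLHPDNGGEFTSAEFASYCTESGVKRQLTPPYSPQQNGVVERRNQTI"
subject = "ILHSDNGSQYTSKAYQELLKKLGITQSMSRPGNPQDNAFIESFFGTL"

identical, aligned = count_identities(query, subject)
print(f"{identical}/{aligned} identical columns ({100 * identical / aligned:.1f}%)")
```

A low identity over a short stretch is consistent with the modest bit score reported for this hit; the E-value, not the raw identity, is what CD-Search uses to rank hits.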
DUF4219 pfam13961
Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus ...
53-78 1.48e-06

Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus of many Gag-pol polyproteins and related proteins. There is a highly conserved YxxWxxxM sequence motif.


Pssm-ID: 433608  Cd Length: 27  Bit Score: 45.96  E-value: 1.48e-06
                           10        20
                   ....*....|....*....|....*.
gi 32487722     53 LTKTNYSDWSLLMRAMLQVRGLWEAV 78
Cdd:pfam13961    1 LDGDNYETWKLRMKLYLQAQDLWEVV 26
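The YxxWxxxM motif named in the DUF4219 description can be located in the aligned rows with a regular expression. This is an illustrative sketch only: the helper `find_motif` is a hypothetical name, and `x` is taken to mean any single residue.

```python
import re

# "x" in YxxWxxxM stands for any residue, so it maps to "." in a regex.
MOTIF = re.compile(r"Y..W...M")


def find_motif(segment: str, first_residue: int):
    """Return (start, end, matched text) in 1-based coordinates, or None."""
    m = MOTIF.search(segment)
    if m is None:
        return None
    return first_residue + m.start(), first_residue + m.end() - 1, m.group()


# Query row (starts at residue 53) and Cdd row (starts at position 1),
# both copied from the DUF4219 alignment above.
print(find_motif("LTKTNYSDWSLLMRAMLQVRGLWEAV", 53))
print(find_motif("LDGDNYETWKLRMKLYLQAQDLWEVV", 1))
```

Both rows contain the motif at the same alignment column, which places it at residues 58-65 of the query protein.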
transpos_IS3 NF033516
IS3 family transposase;
597-642 3.31e-06

IS3 family transposase;


Pssm-ID: 468052 [Multi-domain]  Cd Length: 369  Bit Score: 51.41  E-value: 3.31e-06
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....*.
gi 32487722   597 LLHPDNGGEFTSAEFASYCTESGVKRQLTPPYSPQQNGVVERRNQT 642
Cdd:NF033516  278 ILHSDNGSQYTSKAYREWLKEHGITQSMSRPGNCWDNAVAESFFGT 323
RVT_1 pfam00078
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
1539-1627 7.37e-06

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses.


Pssm-ID: 395031 [Multi-domain]  Cd Length: 189  Bit Score: 48.45  E-value: 7.37e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 32487722   1539 LKKEDASGaehYHPISLIHSFSKTISKLMANRLAPRlcELVSPNQSAFIRkrdirdnflyvqnmvqILHRTKKQSLFIKV 1618
Cdd:pfam00078    1 IPKKGKGK---YRPISLLSIDYKALNKIIVKRLKPE--NLDSPPQPGFRP----------------GLAKLKKAKWFLKL 59

                   ....*....
gi 32487722   1619 DIAKAFDTV 1627
Cdd:pfam00078   60 DLKKAFDQV 68
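The Cdd row coordinates and the reported Cd Length together show how much of each domain model the query actually covers. The sketch below works through that arithmetic; `model_coverage` is a hypothetical helper, and the coordinates are read off the Cdd rows and summary lines in this report.

```python
def model_coverage(model_from: int, model_to: int, cd_length: int) -> float:
    """Fraction of the domain model spanned by the alignment (inclusive coordinates)."""
    if not 1 <= model_from <= model_to <= cd_length:
        raise ValueError("coordinates must satisfy 1 <= from <= to <= cd_length")
    return (model_to - model_from + 1) / cd_length


# (first Cdd position, last Cdd position, Cd Length) taken from this report.
hits = {
    "rve (pfam00665)": (3, 98, 98),
    "RVT_1 (pfam00078)": (1, 68, 189),
}

for name, (start, end, length) in hits.items():
    print(f"{name}: {100 * model_coverage(start, end, length):.0f}% of the model aligned")
```

On these numbers the rve hit spans nearly the whole model, while the RVT_1 hit covers only the N-terminal third or so, which suggests a partial match to the reverse transcriptase model.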
zf-CCHC pfam00098
Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following ...
260-275 5.47e-03

Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following: CX2CX4HX4C, where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger.


Pssm-ID: 395050 [Multi-domain]  Cd Length: 18  Bit Score: 35.58  E-value: 5.47e-03
                           10
                   ....*....|....*.
gi 32487722    260 DKCKNCGRLGHWAKDC 275
Cdd:pfam00098    1 GKCYNCGEPGHIARDC 16
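The CX2CX4HX4C pattern from the zinc knuckle description translates directly into a regular expression, which can be checked against the two aligned rows above. This is a minimal sketch; `find_zinc_knuckles` is a hypothetical helper name, and X is taken to mean any single residue.

```python
import re

# CX2CX4HX4C: Cys, 2 residues, Cys, 4 residues, His, 4 residues, Cys.
ZINC_KNUCKLE = re.compile(r"C.{2}C.{4}H.{4}C")


def find_zinc_knuckles(segment: str, first_residue: int):
    """Return a list of (start, end, matched text) in 1-based coordinates."""
    return [
        (first_residue + m.start(), first_residue + m.end() - 1, m.group())
        for m in ZINC_KNUCKLE.finditer(segment)
    ]


# Query row (starts at residue 260) and Cdd row (starts at position 1),
# both copied from the zf-CCHC alignment above.
print(find_zinc_knuckles("DKCKNCGRLGHWAKDC", 260))
print(find_zinc_knuckles("GKCYNCGEPGHIARDC", 1))
```

The query match runs from residue 262 to residue 275, so the three cysteines and the histidine all fall inside the reported 260-275 interval.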
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options:Database: CDSEARCH/cdd   Low complexity filter: no  Composition Based Adjustment: yes   E-value threshold: 0.01
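The per-hit summary lines in this report ("Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...") can be parsed and checked against the E-value threshold listed above. The sketch below is one way to do that on the plain-text dump; the regular expression and the `parse_summary` helper are assumptions about this text layout, not an NCBI API.

```python
import re

SUMMARY = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r".*?Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>\S+)"
)

E_VALUE_THRESHOLD = 0.01  # from the preset options above


def parse_summary(line: str):
    """Parse one summary line into a dict, or return None if it does not match."""
    m = SUMMARY.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


# Two summary lines copied from this report (rve and zf-CCHC).
lines = [
    "Pssm-ID: 459897 [Multi-domain]  Cd Length: 98  Bit Score: 58.86  E-value: 2.13e-10",
    "Pssm-ID: 395050 [Multi-domain]  Cd Length: 18  Bit Score: 35.58  E-value: 5.47e-03",
]

hits = [h for h in map(parse_summary, lines) if h is not None]
passing = [h for h in hits if h["e_value"] <= E_VALUE_THRESHOLD]
print(f"{len(passing)} of {len(hits)} hits pass the E-value threshold")
```

Every hit shown in the list necessarily passes the threshold; the zf-CCHC hit at 5.47e-03 is the closest to the cutoff.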
