NCBI Conserved Domain Search

Conserved domains on [gi|7839205|ref|NP_058189|]

gag-pol fusion protein [Saccharomyces cerevisiae S288C]

Protein Classification

TYA and RNase_HI_RT_Ty1 domain-containing protein (domain architecture ID 10470241)

protein containing domains TYA, rve, RVT_2, and RNase_HI_RT_Ty1

List of domain hits

Name | Accession | Description | Interval | E-value

TYA | pfam01021 | Ty transposon capsid protein; Ty are yeast transposons. A 5.7kb transcript codes for p3 a ... | 17-400 | 0e+00

Ty transposon capsid protein; Ty are yeast transposons. A 5.7 kb transcript codes for p3, a fusion protein of TYA and TYB. The TYA protein is analogous to the gag protein of retroviruses. TYA is cleaved to form a 46 kDa protein which can form mature virion-like particles. This entry corresponds to the capsid protein from Ty1 and Ty2 transposons.

Pssm-ID: 425992 Cd Length: 384 Bit Score: 771.05 E-value: 0e+00

                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205      17 ACASVTSKEVHTNQDPLDVSASKTEECEKASTKANSQQTTTPASSAVPENPHHASPQPASVPPPQNGPYPQQCMMTQNQA 96
Cdd:pfam01021    1 ACASVTSKEVHTNQDPLDVSASKLQEYDKDSTKANSQQTTTPGSSAVPENHHHASPQPASVPPPQNGPYSQQCMMTPNQA 80

                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205      97 NPSGWSFYGHPSMIPYTPYQMSPMYFPPGPQSQFPQYPSSVGTPLSTPSPESGNTFTDSSSADSDMTSTKKYVRPPPMLT 176
Cdd:pfam01021   81 NPSGWPFYGHPSMMPYTPYQMSPMYFPPGPQSQFPQYPSSVGTPLSTPSPESGNTFTDSSSAKSDMTSTNKYVRPPPILT 160

                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205     177 SPNDFPNWVKTYIKFLQNSNLGGIIPTVNGKPVRQITDDELTFLYNTFQIFAPSQFLPTWVKDILSVDYTDIMKILSKSI 256
Cdd:pfam01021  161 SPNDFLNWVKTYIKFLQNSNLGDIIPTATGKAVRQMTDDEHTFLYNTFQLFAPSQFLPTWVKDILSVDYTDIMKILSKSI 240

                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205     257 EKMQSDTQEANDIVTLANLQYNGSTPADAFETKVTNIIDRLNNNGIHINNKVACQLIMRGLSGEYKFLRYTRHRHLNMTV 336
Cdd:pfam01021  241 EKMQSDTQEVNDIITLANLHYNGSTPADTFETTVTNIIDRLNNNGININDKVACQLIMRGLSGEYKFLRYTRHRHINMTV 320

                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 7839205     337 AELFLDIHAIYEEQQGSRNSKPNYRRNLSDEKNDSRSYTNTTKPKVIARNPQKTNNSKSKTARA 400
Cdd:pfam01021  321 ADLFSDIHAIYEEQQESKRNKPTYRRNPSDEKNDSRTYTNTTKTKVITRNSQKTNNSKSRTAKA 384
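Each alignment row above carries a label, a start position, the aligned residues, and an end position. The following is a minimal Python sketch for reading such a row and checking that the positions agree with the residue count. The names `AlignedRow`, `parse_alignment_row`, `residue_count`, and `is_consistent` are hypothetical helpers introduced here for illustration; they are not part of any NCBI tool. The sketch assumes that `-` marks a gap and that every letter, upper or lower case, counts as one residue.

```python
from typing import NamedTuple


class AlignedRow(NamedTuple):
    label: str      # e.g. "gi 7839205" or "Cdd:pfam01021"
    start: int      # position of the first residue in the row
    sequence: str   # aligned residues, possibly with '-' gaps
    end: int        # position of the last residue in the row


def parse_alignment_row(line: str) -> AlignedRow:
    """Split one alignment row into label, start, sequence, and end."""
    tokens = line.split()
    if len(tokens) < 4:
        raise ValueError(f"not an alignment row: {line!r}")
    # The last three tokens are start, sequence, end; the rest is the label.
    label = " ".join(tokens[:-3])
    return AlignedRow(label, int(tokens[-3]), tokens[-2], int(tokens[-1]))


def residue_count(sequence: str) -> int:
    """Count residues, skipping '-' gap characters (assumed convention)."""
    return sum(1 for ch in sequence if ch.isalpha())


def is_consistent(row: AlignedRow) -> bool:
    """True when end - start + 1 equals the number of residues in the row."""
    return row.end - row.start + 1 == residue_count(row.sequence)


# Last query row of the TYA alignment, copied from the report above.
row = parse_alignment_row(
    "gi 7839205     337 "
    "AELFLDIHAIYEEQQGSRNSKPNYRRNLSDEKNDSRSYTNTTKPKVIARNPQKTNNSKSKTARA 400"
)
print(row.label, row.start, row.end, is_consistent(row))
# prints: gi 7839205 337 400 True
```

A row that fails this check usually points to a line that was truncated or wrapped during copy and paste.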

RNase_HI_RT_Ty1 | cd09272 | Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ... | 1606-1742 | 8.38e-27

Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms including bacteria, archaea, and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses and in long terminal repeat (LTR)-bearing and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are conserved across all RNase H domains. Phylogenetically, the RNase HI domains of LTR retroelements are classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Ty1/Copia family is widely distributed among the genomes of plants, fungi, and animals. RNase H has been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.

Pssm-ID: 260004 Cd Length: 140 Bit Score: 107.55 E-value: 8.38e-27

                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205  1606 VAISDASYGNQPY-YKSQIGNIYLLNGKVIGGKSTKASLTCTSTTEAEIHAISESVPLLNNLSYLIQEL---NKKPIIkg 1681
Cdd:cd09272    1 EGYSDADWAGDPDdRRSTSGYVFFLGGGPISWKSKKQTTVALSSTEAEYIALAEAAKEALWLRRLLEELgipLDGPTT-- 78

                         90       100       110       120       130       140
                 ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 7839205  1682 LLTDSRSTISIIKStneEKF--RNRFFGTKAMRLRDEVSGNNLYVYYIETKKNIADVMTKPLP 1742
Cdd:cd09272   79 IYCDNQSAIALAKN---PVFhsRTKHIDIRYHFIREKVEKGEIKVEYVPTEDQLADILTKPLP 138
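Every hit in this report has a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...", sometimes with a "[Multi-domain]" flag after the Pssm-ID. The sketch below shows one way to turn such a line into typed fields in Python. The `HitSummary` type and the `parse_summary` function are hypothetical names chosen for this example, and the regular expression only assumes the field order seen in this report.

```python
import re
from typing import NamedTuple


class HitSummary(NamedTuple):
    pssm_id: int
    multi_domain: bool
    cd_length: int
    bit_score: float
    e_value: float


# Field order as it appears in this report; the flag is optional.
_SUMMARY = re.compile(
    r"Pssm-ID:\s*(\d+)\s*(\[Multi-domain\])?\s*"
    r"Cd Length:\s*(\d+)\s*"
    r"Bit Score:\s*([\d.]+)\s*"
    r"E-value:\s*(\S+)"
)


def parse_summary(line: str) -> HitSummary:
    """Parse one 'Pssm-ID: ...' summary line into a HitSummary."""
    match = _SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a summary line: {line!r}")
    return HitSummary(
        pssm_id=int(match.group(1)),
        multi_domain=match.group(2) is not None,
        cd_length=int(match.group(3)),
        bit_score=float(match.group(4)),
        e_value=float(match.group(5)),  # "0e+00" parses to 0.0
    )


hit = parse_summary(
    "Pssm-ID: 260004 Cd Length: 140 Bit Score: 107.55 E-value: 8.38e-27"
)
print(hit.pssm_id, hit.cd_length, hit.bit_score, hit.e_value)
```

An E-value reported as 0e+00 parses to 0.0, so any later log transform of E-values needs to handle zero explicitly.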

RVT_2 super family | cl06662 | Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ... | 1280-1498 | 3.21e-23

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model.

The actual alignment was detected with superfamily member pfam07727:

Pssm-ID: 400190 Cd Length: 243 Bit Score: 100.74 E-value: 3.21e-23

                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205    1280 PK--RVINSMFIFNKK----RDGTHKARFVARGDIQHPDTYDSGMQSNTVHHYALMTSLSLALDNNYYITQLDISSAYLY 1353
Cdd:pfam07727   10 PKnvKPIGTTWVHTHKindlKEVQYKARLVAQGFRQIAGEDYDKVFSPVIRLSSVRLLLAIAAEYEWPVHHMDVSSAFLN 89

                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205    1354 ADIKEELYIRPPPHL---GMNDKLIRLKKSLYGLKQSGANWYETIKSYLIK--------QCGMeEVRGwscvFKNSQVTI 1422
Cdd:pfam07727   90 GDIDEEIYVKQPPGFnidNESGKVWQLNKSLYGLKQAPYMWNTCITKVLMDlnfepdtaESGM-YCRG----FGENKLIV 164

                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 7839205    1423 CLFVDDMILFSKDLNANKKIITTLKKQYDTKiiNLGESDneiqyDILGLEIKYQRGKYmKLGMENSLTEKIPKLNV 1498
Cdd:pfam07727  165 GLYVDDMFITGSDITIINDFKLELAKHFKMK--DLGDIS-----EFLGIEFIQIAGGI-RLSQHNYLNSVIKKFNL 232
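The "Cd Length" value gives the length of the domain model, and the "Cdd:" rows of each alignment give the model positions that were actually matched. Dividing one by the other shows how much of the model a hit covers. The Python sketch below illustrates this with the positions read from the alignments above; `model_coverage` is a hypothetical helper, and coverage is taken here simply as the matched span divided by the model length, which is one of several possible definitions.

```python
def model_coverage(model_start: int, model_end: int, cd_length: int) -> float:
    """Fraction of the domain model spanned by the alignment (0.0 to 1.0)."""
    if not 1 <= model_start <= model_end <= cd_length:
        raise ValueError("model positions must satisfy 1 <= start <= end <= length")
    return (model_end - model_start + 1) / cd_length


# (name, first model position, last model position, Cd Length) from the report.
hits = [
    ("TYA / pfam01021", 1, 384, 384),
    ("RNase_HI_RT_Ty1 / cd09272", 1, 138, 140),
    ("RVT_2 / pfam07727", 10, 232, 243),
]

for name, start, end, length in hits:
    print(f"{name}: {model_coverage(start, end, length):.1%} of the model")
```

Under this definition the TYA hit spans the full model, while the RVT_2 hit misses the first nine and the last eleven model positions.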

rve | pfam00665 | Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into ... | 664-765 | 8.55e-12

Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc-binding domain (pfam02022). This domain is the central catalytic domain. The carboxyl-terminal domain is a non-specific DNA-binding domain (pfam00552). The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.

Pssm-ID: 459897 [Multi-domain] Cd Length: 98 Bit Score: 63.10 E-value: 8.55e-12

                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205     664 PFQYLHTDIFgPVHNLPNSAPSYFISFTDETTKFRWVYPLhdRREDSILDVFTTILAFIKNQFQaSVLVIQMDRGSEYTN 743
Cdd:pfam00665    1 PNQLWQGDFT-YIRIPGGGGKLYLLVIVDDFSREILAWAL--SSEMDAELVLDALERAIAFRGG-VPLIIHSDNGSEYTS 76

                           90       100
                   ....*....|....*....|..
gi 7839205     744 RTLHKFLEKNGITPCYTTTADS 765
Cdd:pfam00665   77 KAFREFLKDLGIKPSFSRPGNP 98
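The paired rows above can be compared column by column to estimate how similar the query is to the domain model. The Python sketch below computes a simple percent identity. It assumes that uppercase letters mark aligned residues, lowercase letters mark unaligned residues, and `-` marks a gap, and it divides identical columns by aligned columns only. `percent_identity` is a hypothetical helper; other tools may use a different denominator, so the number is illustrative only.

```python
def percent_identity(query: str, model: str) -> float:
    """Percent of aligned columns in which both rows carry the same residue."""
    if len(query) != len(model):
        raise ValueError("rows must have the same number of columns")
    aligned = 0
    identical = 0
    for q, m in zip(query, model):
        # Only columns with an uppercase residue in both rows count as aligned.
        if q.isupper() and m.isupper():
            aligned += 1
            if q == m:
                identical += 1
    if aligned == 0:
        raise ValueError("no aligned columns")
    return 100.0 * identical / aligned


# Final block of the rve alignment (query 744-765, model 77-98).
query_row = "RTLHKFLEKNGITPCYTTTADS"
model_row = "KAFREFLKDLGIKPSFSRPGNP"
print(round(percent_identity(query_row, model_row), 1))
# prints: 22.7
```

Five of the 22 aligned columns match in this block, which is typical for a distant but significant domain hit.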

Tra5 | COG2801 | Transposase InsO and inactivated derivatives [Mobilome: prophages, transposons]; | 663-782 | 4.28e-08

Transposase InsO and inactivated derivatives [Mobilome: prophages, transposons];

Pssm-ID: 442053 [Multi-domain] Cd Length: 309 Bit Score: 57.09 E-value: 4.28e-08

                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205   663 EPFQYLHTDIFgpvhnlpnsapsYFisftdeTTKFRWVYPLhdrredSILDVFT-TILAF------------------IK 723
Cdd:COG2801  147 APNQVWVTDIT------------YI------PTAEGWLYLA------AVIDLFSrEIVGWsvsdsmdaelvvdalemaIE 202

                         90       100       110       120       130
                 ....*....|....*....|....*....|....*....|....*....|....*....
gi 7839205   724 NQFQASVLVIQMDRGSEYTNRTLHKFLEKNGITPCYTTTADSRAHGVAERLNRTLLDDC 782
Cdd:COG2801  203 RRGPPKPLILHSDNGSQYTSKAYQELLKKLGITQSMSRPGNPQDNAFIESFFGTLKYEL 261

transpos_IS481 | NF033577 | IS481 family transposase | 652-780 | 5.99e-08

IS481 family transposase

Pssm-ID: 468094 [Multi-domain] Cd Length: 283 Bit Score: 56.06 E-value: 5.99e-08

                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205    652 KGSRLKYQNSYePFQYLHTDIFGpVHNLPNSAPSYFISFTDETTKFRWVYPLHDRREDSILDVFTTILAfiknQFQASVL 731
Cdd:NF033577  116 TGKVKRYERAH-PGELWHIDIKK-LGRIPDVGRLYLHTAIDDHSRFAYAELYPDETAETAADFLRRAFA----EHGIPIR 189

                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|.
gi 7839205    732 VIQMDRGSEYTNRT--LHKFLEKNGITPCYTTTADSRAHGVAERLNRTLLD 780
Cdd:NF033577  190 RVLTDNGSEFRSRAhgFELALAELGIEHRRTRPYHPQTNGKVERFHRTLKD 240

transpos_IS3 | NF033516 | IS3 family transposase; | 722-782 | 1.64e-06

IS3 family transposase;

Pssm-ID: 468052 [Multi-domain] Cd Length: 369 Bit Score: 52.18 E-value: 1.64e-06

                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 7839205    722 IKNQFQASVLVIQMDRGSEYTNRTLHKFLEKNGITPCYTTTADSRAHGVAERLNRTLLDDC 782
Cdd:NF033516  268 IEWRGKPEGLILHSDNGSQYTSKAYREWLKEHGITQSMSRPGNCWDNAVAESFFGTLKREC 328

Amelogenin | smart00818 | Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ... | 53-146 | 2.18e-04

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9 bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.

Pssm-ID: 197891 [Multi-domain] Cd Length: 165 Bit Score: 43.62 E-value: 2.18e-04

                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205       53 QQTTTPASSAVPenPHHASPQPASVP--PPQNGPYPQQCMMT---QNQANPSGwsfyGHPSMIPYTPYQMSPMYFPPGPQ 127
Cdd:smart00818   36 HHQIIPVSQQHP--PTHTLQPHHHIPvlPAQQPVVPQQPLMPvpgQHSMTPTQ----HHQPNLPQPAQQPFQPQPLQPPQ 109

                            90
                    ....*....|....*....
gi 7839205      128 SQFPQYPSSVGTPLSTPSP 146
Cdd:smart00818  110 PQQPMQPQPPVHPIPPLPP 128

PHA02517 | PHA02517 | putative transposase OrfB; Reviewed | 690-782 | 1.69e-03

putative transposase OrfB; Reviewed

Pssm-ID: 222853 [Multi-domain] Cd Length: 277 Bit Score: 42.54 E-value: 1.69e-03

                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205    690 FTDETTKFRWVY-----PLHDRR---------EDSILdVFTTILAFIKNQFQASVLVIQMDRGSEYTNRTLHKFLEKNGI 755
Cdd:PHA02517  117 FTYVSTWQGWVYvafiiDVFARRivgwrvsssMDTDF-VLDALEQALWARGRPGGLIHHSDKGSQYVSLAYTQRLKEAGI 195

                          90       100
                  ....*....|....*....|....*..
gi 7839205    756 TPCYTTTADSRAHGVAERLNRTLLDDC 782
Cdd:PHA02517  196 RASTGSRGDSYDNAPAESINGLYKAEV 222

Atrophin-1 | pfam03154 | Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... | 53-197 | 5.24e-07

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.

Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 54.77 E-value: 5.24e-07

                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205      53 QQTTTPASSAVPeNPHHA-SPQPASVPPPQNGPYPQQCMMTQNQANPSGWSFYGHPSMIPY--------TPYQMSPMYFP 123
Cdd:pfam03154  232 QQTPTLHPQRLP-SPHPPlQPMTQPPPPSQVSPQPLPQPSLHGQMPPMPHSLQTGPSHMQHpvppqpfpLTPQSSQSQVP 310

                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205     124 PGPQSQFP-QYPSSVGTPLSTPSPESGNTFTDSSSADSDMtsTKKYVRPPPMLTSP-------NDFPNWVKTYIKFLQNS 195
Cdd:pfam03154  311 PGPSPAAPgQSQQRIHTPPSQSQLQSQQPPREQPLPPAPL--SMPHIKPPPTTPIPqlpnpqsHKHPPHLSGPSPFQMNS 388


                   ..
gi 7839205     196 NL 197
Cdd:pfam03154  389 NL 390

PAT1 | pfam09770 | Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... | 41-141 | 1.49e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.

Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 43.49 E-value: 1.49e-03

                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839205      41 EECEkASTKANSQQTTTPASSAVPENPHHASPQPASVPPPQNGPYPQQCMMTQNQANPSGWSFYGHPSMI----PYTPYQ 116
Cdd:pfam09770  197 EEVE-AAMRAQAKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTIlqrpQSPQPD 275

                           90       100
                   ....*....|....*....|....*
gi 7839205     117 MSPMYFPPGPQSQFPQYPSSVGTPL 141
Cdd:pfam09770  276 PAQPSIQPQAQQFHQQPPPVPVQPT 300

Blast search parameters

Data Source:	Precalculated data, version = cdd.v.3.21
Preset Options:
Database: CDSEARCH/cdd
Low complexity filter: no
Composition Based Adjustment: yes
E-value threshold: 0.01
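
The E-value threshold of 0.01 in the preset options decides which hits are reported at all. The Python sketch below applies the same kind of cutoff to the hits listed in this report, and then a stricter one. The `Hit` type and `filter_hits` function are hypothetical names used for illustration, and treating the cutoff as inclusive (E-value less than or equal to the threshold) is an assumption of this sketch, not a documented property of the search tool.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    name: str
    e_value: float


# Names and E-values copied from the hit list in this report.
HITS = [
    Hit("TYA", 0.0),
    Hit("RNase_HI_RT_Ty1", 8.38e-27),
    Hit("RVT_2", 3.21e-23),
    Hit("rve", 8.55e-12),
    Hit("Tra5", 4.28e-08),
    Hit("transpos_IS481", 5.99e-08),
    Hit("Atrophin-1", 5.24e-07),
    Hit("transpos_IS3", 1.64e-06),
    Hit("Amelogenin", 2.18e-04),
    Hit("PAT1", 1.49e-03),
    Hit("PHA02517", 1.69e-03),
]


def filter_hits(hits: List[Hit], threshold: float) -> List[Hit]:
    """Keep hits whose E-value is at or below the threshold (assumed inclusive)."""
    return [hit for hit in hits if hit.e_value <= threshold]


print(len(filter_hits(HITS, 0.01)))                   # all reported hits pass
print([hit.name for hit in filter_hits(HITS, 1e-10)])  # only the strongest hits
```

With the stricter cutoff of 1e-10, only TYA, RNase_HI_RT_Ty1, RVT_2, and rve remain, which matches the four domains named in the protein classification at the top of the report.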