3D Macromolecular Structures
 
 
 
 
How to align a query protein to a similar sequence from a 3D structure
and interactively view sequence/structure relationships
 
 

 

The three-dimensional structures of biomolecules provide a wealth of information on their biological function and evolutionary relationships. Even if a 3D structure for your protein of interest has not yet been resolved, it is possible to align your query protein to a similar sequence from a 3D structure, then interactively view the sequence/structure relationships, as shown in the illustration. Use method 1 if the protein of interest already has a sequence record in the Entrez Protein database, and use method 2 if the protein sequence is not yet in that database. The additional notes near the bottom of the page provide step-by-step instructions on how to generate the view shown on this page and how to identify putative active site residues.

 

Method 1: If your protein of interest already has a sequence record in the Entrez Protein database:

  1. Open the Entrez Protein search page

  2. Retrieve your sequence record of interest, for example, human prostaglandin-endoperoxide synthase 1 isoform 1 precursor (accession NP_000953.2, gi 18104967), which is a product of the human PTGS1 gene (GeneID 5742).

  3. Scroll down to the Analyze this sequence menu in the right margin of the protein sequence record display and select Run BLAST. In this example, accession number NP_000953.2 will be shown in the Enter Query Sequence panel.

  4. Under Choose Search Set choose Protein Data Bank (PDB) in the pulldown menu under Database. Under Program Selection choose blastp (protein-protein BLAST) under Algorithm and launch the query search by pressing the blue BLAST button.

  5. The summary displayed by pBLAST shows the related PDB structures, alignment footprints of the related structures relative to the query, and links that allow you to display the 3D structure and sequence alignment in iCn3D.

  6. To view a sequence alignment of the query and a hit of interest, click on the hit of interest in the Description column (the first column) of the table display, or click on the Alignment tab to view the alignments for the hits.

  7. To view a structure alignment of the query and a hit of interest, click on the Alignment tab, then click on Structure under the Related Information section in the right margin. This launches the iCn3D structure viewer and opens an interactive view of the sequence alignment and the corresponding 3D structure.

  8. Once the iCn3D display is open, you can click on any amino acid from the retrieved structure, in either the 3D structure or the sequence alignment window, to highlight its location in both views and examine the sequence/structure relationship. The iCn3D documentation provides more information about using the program.
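
The search set up in steps 3 and 4 can also be expressed as a URL. The following is a minimal sketch, not an official client: it assumes the NCBI BLAST URL API parameters CMD, PROGRAM, DATABASE and QUERY, and it assumes that the database name pdb (taken from the search set label on this page) is accepted there. The helper name build_pdb_search_url is hypothetical. The code only builds the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI BLAST web service.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_pdb_search_url(query, program="blastp", database="pdb"):
    """Return a URL that would submit `query` to BLAST against PDB sequences.

    `query` may be an accession (as in Method 1) or a raw sequence.
    The database name "pdb" is an assumption based on the search set label.
    """
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": program,    # blastp = protein-protein BLAST
        "DATABASE": database,  # restrict hits to sequences from 3D structures
        "QUERY": query,
    }
    return BLAST_URL + "?" + urlencode(params)


# Example from step 2: human prostaglandin-endoperoxide synthase 1.
url = build_pdb_search_url("NP_000953.2")
print(url)
```

Building the URL separately from sending it keeps the sketch easy to inspect before any request is made; paste the printed URL into a browser to check that it behaves like the form-based search.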

Method 2: If your sequence of interest is not yet available in the Entrez Protein database:

  1. If you have a protein query sequence, open the Protein BLAST (blastp) page and Choose Search Set: Protein Data Bank proteins (pdb).

    If you have a nucleotide query sequence and want to compare its translation against protein sequences from resolved 3D structures, open the Translated BLAST (blastx) page and Choose Search Set: Protein Data Bank proteins (pdb).

  2. Enter your query sequence in FASTA format. If desired, adjust the algorithm parameters to make the search more or less stringent than the default, then press the blue BLAST button at the bottom of the query page to execute the search.

  3. The summary displayed by pBLAST shows the related PDB structures, alignment footprints of the related structures relative to the query, and links that allow you to display the 3D structure and sequence alignment in iCn3D as in method 1 above.
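
The choice in step 1 (blastp for a protein query, blastx for a nucleotide query) and the FASTA formatting in step 2 can be sketched as follows. This is an illustration only: the function names, the 0.9 threshold and the letter-counting heuristic are assumptions made for this example, not something BLAST itself prescribes, and the example sequences are made up.

```python
# Letters that commonly appear in nucleotide sequences (including N for unknown).
NUCLEOTIDE_LETTERS = set("ACGTUN")


def clean_sequence(raw):
    """Drop FASTA header lines and whitespace; return uppercase residues."""
    lines = [ln.strip() for ln in raw.splitlines()]
    residues = "".join(ln for ln in lines if not ln.startswith(">"))
    return "".join(residues.split()).upper()


def choose_program(raw, threshold=0.9):
    """Guess whether the query is nucleotide (blastx) or protein (blastp).

    Heuristic (an assumption): if at least `threshold` of the letters are
    nucleotide codes, treat the query as nucleotide.
    """
    seq = clean_sequence(raw)
    if not seq:
        raise ValueError("empty query sequence")
    fraction = sum(ch in NUCLEOTIDE_LETTERS for ch in seq) / len(seq)
    return "blastx" if fraction >= threshold else "blastp"


def to_fasta(name, raw, width=60):
    """Format a sequence as FASTA: a '>' header line, then wrapped residues."""
    seq = clean_sequence(raw)
    chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(chunks)


# Made-up example queries.
print(choose_program("ATGCTCGCGAGGGCC"))       # nucleotide-like query
print(choose_program("MSRSLLLRFLLFLLLLPPLP"))  # protein-like query
print(to_fasta("query1", "atg ctc\ngcg", width=4))
```

A short peptide made up only of A, C, G, T or N residues would be misclassified by this heuristic, so treat the result as a suggestion and confirm the program choice by eye.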

Additional Notes:

  1. A query protein sequence can potentially retrieve many similar 3D-structure-based sequences. Some of the structures might be free proteins; others might be bound to another molecule, such as a chemical or another protein. The salient features of each structure are described in the publication(s) cited on its MMDB summary page, accessible by clicking on a structure's thumbnail image.

  2. The 1PTH structure shown in the illustration is one of the structure neighbors found for protein sequence gi 18104967, as of 11 March 2021. 1PTH is featured here because it shows a chemical, salicylic acid, blocking the channel that leads to the protein's active site. Though it is an ovine protein, it is homologous to the human protein and can therefore be helpful in elucidating the biological function of the human protein.
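
Transferring a residue of interest from the query to the homologous structure, as happens when you highlight residues in both views, amounts to walking the columns of the pairwise alignment. The sketch below illustrates that idea under stated assumptions: the aligned strings and start positions are made-up toy values (not the real human/sheep alignment), gaps are written as "-", and the function name map_query_to_structure is hypothetical. The positions returned are positions in the aligned subject sequence, which may differ from the residue numbering used in the structure file.

```python
def map_query_to_structure(aligned_query, aligned_subject,
                           query_start=1, subject_start=1):
    """Map query positions to subject positions through a gapped alignment.

    Returns {query_pos: (subject_pos, query_residue, subject_residue)} for
    every column where both sequences have a residue (no gap).
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")

    mapping = {}
    q_pos, s_pos = query_start, subject_start
    for q_char, s_char in zip(aligned_query, aligned_subject):
        q_is_residue = q_char != "-"
        s_is_residue = s_char != "-"
        if q_is_residue and s_is_residue:
            mapping[q_pos] = (s_pos, q_char, s_char)
        # Advance each counter only when that sequence has a residue here.
        if q_is_residue:
            q_pos += 1
        if s_is_residue:
            s_pos += 1
    return mapping


# Toy alignment (made up): the query starts at residue 10, the subject at 5.
mapping = map_query_to_structure("MKT-AYIAK", "MRTQAY-AK",
                                 query_start=10, subject_start=5)
print(mapping[13])    # query residue 13 and its partner in the subject
print(15 in mapping)  # query residue 15 faces a gap, so it has no partner
```

Query residues that face a gap have no entry in the mapping, which mirrors the fact that they have no counterpart to highlight in the 3D structure.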

 
HUMAN PROTEIN ALIGNED TO SHEEP HOMOLOG:
STRUCTURAL BASIS OF ASPIRIN ACTIVITY
(PDB accession 1PTH; MMDB ID 50885)
 
Image of a human query protein sequence aligned to a homologous sheep sequence that has a resolved 3D structure, viewed in the free Cn3D software program.
 
 
  In the example shown here, the human PTGS1 gene product is aligned to the sheep homolog, which has a known sequence and a resolved 3D structure. Salicylic acid is blocking the channel to the active site, where a heme cofactor is also shown. As a result, the protein is no longer able to convert its substrate, prostaglandin G2, to its prostaglandin H2 product, thereby reducing pain and inflammation. This inferred structural basis of aspirin activity is described in corresponding publications, which are accessible as links from the structure record.  
 
 
 
 
Revised 12 March 2021