GenBank Database Divisions
GenBank divisions fall into two general categories, described by Ouellette and Boguski in the article "Database Divisions and Homology Search Files: A Guide for the Perplexed" (Genome Research (1997) 7(10)); the full-text article is available. The "Organismal" divisions hold sequences derived from specific groups of organisms, and the "Functional" divisions hold particular types of sequence data.

Each sequence record exists in only one GenBank division. For example, the HTG division includes unfinished sequences (phases 0, 1, and 2) generated from several different organisms. When a sequence is updated to phase 3, it is moved into the appropriate organismal division. For instance, human phase 3 (finished) HTG sequences are located in the PRI division.

The GenBank divisions listed here represent the location of the annotated sequence records; for homology search purposes the records are reformatted and stored in the BLAST databases. The database divisions currently available, along with the related BLAST databases, are listed below. An example of a submission (one accession number) that has progressed through phase 1, phase 2, and phase 3 is available (Examples).
Organismal Divisions
Division | Description | BLAST database | Example |
---|---|---|---|
BCT | Bacterial sequences | nr, month | |
PRI | Primate sequences | nr, month | Human Phase 3 |
ROD | Rodent sequences | nr, month | |
MAM | Other mammalian sequences | nr, month | |
VRT | Other vertebrate sequences | nr, month | |
INV | Invertebrate sequences | nr, month | Drosophila, C. elegans Phase 3 |
PLN | Plant and Fungal sequences | nr, month | Arabidopsis Phase 3 |
VRL | Viral sequences | nr, month | |
PHG | Phage sequences | nr, month | |
RNA | Structural RNA sequences | nr, month | |
SYN | Synthetic and chimeric sequences | nr, month | |
UNA | Unannotated sequences | nr, month | |
Functional Divisions
Division | Description | BLAST database | Example |
---|---|---|---|
EST | Expressed Sequence Tags | dbest, month | |
STS | Sequence Tagged Sites | dbsts, month | |
GSS | Genome Survey Sequences | dbgss, month | |
HTG | High Throughput Genomic sequences | htgs, month | All Organisms: Phase 0, 1, and 2 |
- Phase 0 sequences are single- or few-pass reads of a single clone (usually not contigs).
- Phase 1 sequences are unfinished, unordered, and contain gaps.
- Phase 2 sequences are unfinished, ordered, and can contain one or more gaps.
- Phase 3 sequences are high-quality finished sequences that do not contain gaps.
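
The rule that links the HTG phases to the divisions can be sketched in a few lines of Python. This is only an illustration of the behavior described on this page, not NCBI code: the function name `division_for_htg_record` and the dictionary and set names are hypothetical, and the caller is assumed to already know which organismal division the source organism belongs to.

```python
# Organismal division codes, copied from the table on this page.
ORGANISMAL_DIVISIONS = {
    "BCT", "PRI", "ROD", "MAM", "VRT", "INV",
    "PLN", "VRL", "PHG", "RNA", "SYN", "UNA",
}

# Short summaries of the HTG phase definitions listed above.
PHASE_DESCRIPTIONS = {
    0: "single- or few-pass reads of a single clone",
    1: "unfinished, unordered, contains gaps",
    2: "unfinished, ordered, may contain gaps",
    3: "finished, high quality, no gaps",
}


def division_for_htg_record(phase: int, organismal_division: str) -> str:
    """Return the division expected to hold a high-throughput genomic record.

    Hypothetical helper: phases 0-2 stay in HTG, and a phase 3 record
    moves to the organismal division of its source organism.
    """
    if phase not in PHASE_DESCRIPTIONS:
        raise ValueError(f"unknown HTG phase: {phase!r}")
    if organismal_division not in ORGANISMAL_DIVISIONS:
        raise ValueError(f"unknown organismal division: {organismal_division!r}")
    return organismal_division if phase == 3 else "HTG"
```

For example, `division_for_htg_record(2, "PRI")` gives `"HTG"` for an unfinished human clone, while `division_for_htg_record(3, "PRI")` gives `"PRI"` once the same record is finished. Because the function returns a single code, it also reflects the point that a record lives in only one division at a time.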