COG stands for Clusters of Orthologous Genes. The database was created in 1997 (Tatusov et al., PMID: 9381173) and has been updated several times since, most recently in 2021 (Galperin et al., PMID: 33167031). The current update includes genomes from 2,103 bacterial and 193 archaeal species, of which 2,232 are complete genomes in RefSeq and 64 are at the “Chromosome” level. New features include updated COG annotations with corresponding references and PDB links, where available, and >100 new COGs for proteins involved in protein secretion pathways and for proteins with unknown functions.
Statistics
COGs | Genomic loci | Taxonomic Categories | Organisms | Protein IDs | COG symbols |
---|---|---|---|---|---|
4,981 | 5,994,451 | 42 | 2,296 | 5,628,142 | 3,986 |