
XML at NCBI

IEB ToolBox Page

What is the NCBI ToolBox?

Internally, NCBI stores data in whichever form best suits the flow of the data and its semantics. These forms include normalized relational databases (e.g., for ESTs), ASN.1 (e.g., for other types of sequences), and XML (e.g., for journal articles). NCBI also distributes the same data in a number of formats, such as GenBank, FASTA, ASN.1, and XML, no matter how the data are natively stored. For a particular nucleotide sequence of the human beta globin locus, the options are:

  • GenBank format
  • FASTA format
  • ASN.1 format
  • Data Encoding - A formal specification and encoding rules, based on the telecommunications standard ASN.1. The specification has also been mapped to XML.
  • Programming Libraries - Originally written in a portable dialect of C, and since also written in C++.
The ToolBox model and code are used extensively within NCBI for internal pipelines and tools such as GenBank, Entrez, BLAST, Sequin, OMIM, RefSeq, and others. We make the same tools available in the public domain for whatever purposes the community may desire. These tools are supported in the sense that they are designed to work in many environments outside NCBI, so we expect to be able to fix bugs and answer questions about using them. They are not, however, supported in the sense of a turnkey system with extensive documentation. The distribution does include applications set up with standard makefiles, such as Sequin, BLAST, and a program to convert ASN.1 data to XML, but it is intended primarily for serious programmers.

NCBI Data in XML
NCBI software tools can now automatically produce data either as ASN.1, as before, or as XML. This gives developers access to the full internal NCBI data set using a variety of open source tools. In addition, a number of specifications have been developed to present simpler views of the data in XML, specifically for use by application developers outside NCBI. Entrez can display and download data in XML, and a standalone tool, asn2xml, can convert ASN.1 daily update files into XML on your site.
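
As an illustration of reading such XML with open source tools, the following is a minimal sketch in Python using only the standard library. The element names (TSeqSet, TSeq, TSeq_accver, TSeq_defline, TSeq_sequence) are modeled on one of the simpler sequence views and should be treated as assumptions to check against the published DTDs; the record itself is invented for the example and is not real NCBI data.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only. The element names are assumptions modeled on a
# simplified sequence view, and the values are made up for this sketch.
SAMPLE_XML = """<TSeqSet>
  <TSeq>
    <TSeq_seqtype value="nucleotide"/>
    <TSeq_accver>EXAMPLE1.1</TSeq_accver>
    <TSeq_defline>hypothetical example sequence</TSeq_defline>
    <TSeq_length>12</TSeq_length>
    <TSeq_sequence>ACGTACGTACGT</TSeq_sequence>
  </TSeq>
</TSeqSet>"""


def summarize_sequences(xml_text):
    """Return one small dict per TSeq element found in the XML text."""
    root = ET.fromstring(xml_text)
    records = []
    for tseq in root.iter("TSeq"):
        sequence = tseq.findtext("TSeq_sequence", default="")
        records.append({
            "accession": tseq.findtext("TSeq_accver", default=""),
            "title": tseq.findtext("TSeq_defline", default=""),
            # Length is computed from the sequence text itself rather than
            # trusted from the TSeq_length element.
            "length": len(sequence),
        })
    return records


if __name__ == "__main__":
    for record in summarize_sequences(SAMPLE_XML):
        print(record)
```

The same function could be pointed at XML saved from Entrez or produced by asn2xml, provided the element names are adjusted to match the DTD actually in use.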

Contacting Us

Information for ToolBox programmers is made available on this page. In addition, you may send questions by email to toolbox@ncbi.nlm.nih.gov.


Revised May 8, 2003


Hot Spots

  • DataModel and C Toolkit Docs
  • XML DTDs
  • NCBI C++ Toolkit Docs
  • NCBI Toolkit Source Browser