Support for Genome Workbench will end on March 31, 2024. You may still use the application, but supporting documentation will not be available after this date.
Genome Submission Wizard
The Genome Submission Wizard opens a stand-alone, tabbed dialog in which a genome submission can be created. The process is similar to that used by the other NCBI submission tools, BankIt and Sequin, to direct the input of sequence and other metadata to create a complete submission for GenBank. The submission file can then be further edited and validated in Genome Workbench before it is submitted to GenBank.
Open the dialog by choosing Genome Submission Wizard, the first item in the Submission menu, or by opening an existing FASTA or ASN.1 file with the File Open dialog. When starting the Submission Wizard, a submitter will be prompted to open a FASTA or ASN.1 file if no file is already open. If a file is already open, the Submission Wizard will import any information contained within the open file.
Information entered into the Submission Wizard can be exported and imported as template files. Template files can also be generated using the GenBank Submission Template.
Submitter
Submitter Name
The Submitter/Name page in the Submission Wizard collects name and email contact information for the person primarily responsible for the submission. The contact name, the names listed as the authors of the sequence, and the authors of journal articles or other publications referencing this submission can differ from one another, but the contact name is copied to the others by default. The contact name will not appear in the GenBank record unless it is also listed as a sequence author or as an author on the related publication (see the Reference tab for more information).
Submitter Affiliation
The Submitter/Affiliation page collects additional information about the submitter. The affiliation information will also be used for the sequence authors and associated publications. State/province names are optional and should be entered only for those countries that use them. Items with an asterisk are required.
General
The General tab page collects the BioProject and BioSample accessions associated with this submission. A BioProject is a collection of biological data related to a single initiative, originating from a single organization or from a consortium. The BioSample database contains descriptions of biological source materials used in experimental assays. See the BioProject and BioSample pages for information on how to obtain BioProject and BioSample accession numbers.
The BioProject and BioSample accessions will be listed in the flatfile display in the DBLink section.
The General tab also collects the release date for the submission. Select a release date within the next three years or select “Immediately after processing” to have the genome released as soon as its processing has completed. The user can request an extension to the release date for an unreleased genome at any time. However, the genome will be released on its release date or when the accession or data is published or is publicly available, whichever is first.
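The release-date window described above can be sketched as a small check. This is only an illustration: the function name is hypothetical, and treating both today and the date exactly three years out as allowed is an assumption, not documented Wizard behavior.

```python
from datetime import date


def is_valid_release_date(requested: date, today: date) -> bool:
    """Return True if `requested` falls between today and three years from today.

    Assumption: both endpoints are inclusive.
    """
    try:
        limit = today.replace(year=today.year + 3)
    except ValueError:
        # `today` is Feb 29 and the target year is not a leap year.
        limit = today.replace(year=today.year + 3, day=28)
    return today <= requested <= limit
```

Choosing “Immediately after processing” would bypass a check like this, since no date is selected in that case.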
Genome Info
Genome Info - Assembly
The Genome info/Assembly page collects information about how the genome was assembled.
Genome Info - Sequencing Information
The Genome info/Sequencing information page collects information about how the genome was sequenced.
Genome submissions require assembly information to be included within the Genome Assembly-Data structured comment. This structured comment includes the following metadata:
- Assembly Name: a short name suitable for display (for example, LoxAfr_3.0 for a Loxodonta africana assembly, version 3.0)
- Assembly Method: includes version or date the program was run (for example, Newbler v. 2.3 or Celera Assembly v. May 2010)
- Genome Coverage (for example, 12x)
- Sequencing Technology (for example, ABI 3730; 454 GS-FLX Titanium; Illumina GAIIx)
The Assembly Name is optional. Assembly Method requires 'v. ' between the algorithm name and its version (or the month and year it was run). If more than one sequencing technology was used, separate them with a semicolon (for example, Sanger; Illumina GAIIx).
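The formatting rules above could be checked before entering values into the Wizard, roughly as follows. This is a minimal sketch, not the Wizard's own validation: the function name and the regular expression are assumptions, and treating Genome Coverage and Sequencing Technology as required is inferred from the statement that only Assembly Name is optional.

```python
import re

# Assumed pattern: "<program name> v. <version, or month and year>"
ASSEMBLY_METHOD = re.compile(r"^\S.*? v\. \S.*$")


def check_assembly_metadata(fields: dict) -> list:
    """Return a list of problems found in Genome-Assembly-Data values."""
    problems = []
    if not ASSEMBLY_METHOD.match(fields.get("Assembly Method", "")):
        problems.append("Assembly Method needs 'v. ' between program and version")
    if not fields.get("Genome Coverage"):
        problems.append("Genome Coverage is missing")
    # Multiple technologies are separated with semicolons.
    technologies = fields.get("Sequencing Technology", "").split(";")
    if not [t for t in technologies if t.strip()]:
        problems.append("Sequencing Technology is missing")
    return problems
```

For example, "Newbler v. 2.3" passes the Assembly Method check, while "Newbler 2.3" and "Newbler v.2.3" do not.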
The information entered in these tabs will be displayed in the sequence record as a Genome Data Assembly comment, which appears in the COMMENT section and is framed by the default tags ##Genome-Assembly-Data-START## and ##Genome-Assembly-Data-END##.
This information can also be applied to the sequence data using the Submission->Comments->GenomeDataAssembly Comment menu item, and can be edited by clicking on the pen icon in the left margin of the flatfile record view.
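To give a rough idea of how the framed comment looks in the COMMENT section, the sketch below renders a set of values between the START and END tags. The tags come from the text above. The "::" separator and the column padding are assumptions about the flatfile layout, and the function name is hypothetical.

```python
def format_assembly_comment(fields: dict) -> str:
    """Render key/value pairs between the Genome-Assembly-Data tags."""
    # Pad keys to a common width so the separators line up (assumed layout).
    width = max((len(key) for key in fields), default=0)
    lines = ["##Genome-Assembly-Data-START##"]
    for key, value in fields.items():
        lines.append(f"{key.ljust(width)} :: {value}")
    lines.append("##Genome-Assembly-Data-END##")
    return "\n".join(lines)


print(format_assembly_comment({
    "Assembly Method": "Newbler v. 2.3",
    "Genome Coverage": "12x",
    "Sequencing Technology": "454 GS-FLX Titanium",
}))
```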
Organism Info
The Organism info tab’s General and Additional qualifiers pages collect information about the biological source from which the sequence was isolated.
Organism Info - General
Organism can be an established, known name, or it can be a new name, which the NCBI Taxonomy group will verify during processing. A strain name can be an established, known name, or it can be a new strain of an established or new organism. Isolate, cultivar, or breed may be reported instead of strain, but at least one of the four values must be supplied.
Organism Info - Additional Qualifiers
Organism information values can be entered into the fields individually, or a tab-delimited table of values can be imported using the ‘Import tab-delimited table’ button. (This table reader can also be accessed from the Submission->Import->Use Table Reader menu item.)
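A tab-delimited table for import can be prepared in any spreadsheet or with a short script. The sketch below writes one with Python's csv module. The column headers shown (SeqId, strain, country) are hypothetical placeholders; the headers the table reader actually accepts should be confirmed in the Import manual.

```python
import csv
import io


def build_source_table(rows: list, columns: list) -> str:
    """Return tab-delimited text with a header row and one line per sequence."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        # Missing values are written as empty cells.
        writer.writerow([row.get(column, "") for column in columns])
    return buffer.getvalue()


table = build_source_table(
    [{"SeqId": "contig1", "strain": "K-12"}],
    ["SeqId", "strain", "country"],
)
```

The resulting text can be saved to a file and loaded with the ‘Import tab-delimited table’ button.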
The information from these pages will be displayed in the sequence record as a source feature with associated qualifiers in the Features section.
The submitter can also edit biological source information using the Submission->Tools->Bulk Source Edit menu item, where the submitter can add different information for different sequences. The submitter can also edit the source information for a single sequence by clicking on the pen icon in the left margin next to the source feature in the flatfile record view.
The Genome Submission Wizard is intended to be used for sequences where source information is the same for all sequences, with the exception of identifying sequences as chromosomes, plasmids, or organelles.
Molecule Info
The Molecule info tab has three pages with options that allow the submitter to label sequences as chromosomes, plasmids, or organelles, as appropriate.
Molecule Info - Chromosome
If an organism has only one chromosome, no chromosome name is needed. If the organism has multiple chromosomes, each sequence that is localized to a chromosome should be labeled with the appropriate chromosome name.
The checkbox for “Is the chromosome” should be checked only if the chromosome is represented by a single sequence in the submission and if that sequence represents the entire chromosome (with or without gaps). If an individual chromosome is in more than one sequence, do not select the “Is the chromosome” option.
If the “Is the chromosome” option is selected and the biological topology of the chromosome is circular, check the circular box. If your circular chromosome is in one piece, but you were unable to circularize the molecule because there is a gap between the ends, add 100 N's to the end of the sequence to indicate the gap. If the ends of the circle overlap, the sequence should be trimmed so that the ends abut with no overlap. Fragments of circular chromosomes should not be marked as circular. If a chromosome is in more than one sequence, do not select the "circular" option.
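The two sequence adjustments described above (padding an unresolved gap with 100 N's, and trimming overlapping ends) could be sketched as follows. Both function names are hypothetical. The overlap check looks only for an exact match between the start and end of the sequence, which is a simplification; real assemblies usually need an alignment-based comparison, and the default minimum overlap of 20 bases is an arbitrary assumption.

```python
GAP = "N" * 100  # 100 N's mark the gap between the ends of the circle


def close_circular_gap(sequence: str) -> str:
    """Append 100 N's to indicate a gap between the ends of a circular molecule."""
    return sequence + GAP


def trim_end_overlap(sequence: str, min_overlap: int = 20) -> str:
    """Remove a suffix that exactly repeats the prefix, so the ends abut."""
    for size in range(len(sequence) // 2, max(min_overlap, 1) - 1, -1):
        if sequence[:size] == sequence[-size:]:
            return sequence[:-size]
    return sequence  # no exact overlap of the required length was found
```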
Molecule Info - Plasmid
The Plasmid page allows a submitter to label sequences that represent plasmids or pieces of plasmids. If a plasmid sequence has no gaps and represents the complete molecule, it should be marked as complete and circular (if appropriate). If a circular plasmid sequence has gaps but represents the complete molecule, it should still be marked as circular but not complete. If the sequence is just a fragment of the plasmid, it should still be labelled as a plasmid, but not marked as circular or complete.
Molecule Info - Organelle
The Organelle page allows the submitter to label sequences that represent organelles or fragments of organelles. If an organelle sequence has no gaps and represents the complete molecule, it should be marked as complete and circular (if appropriate). If a circular organelle sequence has gaps but represents the complete molecule, it should still be marked as circular but not complete. If the sequence is just a fragment of the organelle, it should still be labelled as an organelle, but not marked as circular or complete.
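The rules for the complete and circular checkboxes are the same for plasmids and organelles, and can be summarized as a small decision function. This is a sketch with hypothetical names. The case of a linear whole molecule with gaps is not spelled out above, so marking it as not complete is an assumption.

```python
from dataclasses import dataclass


@dataclass
class MoleculeFlags:
    circular: bool
    complete: bool


def molecule_flags(is_whole_molecule: bool, has_gaps: bool,
                   is_circular_topology: bool) -> MoleculeFlags:
    """Decide which boxes to check for a plasmid or organelle sequence."""
    if not is_whole_molecule:
        # Fragments are labelled by type only, never circular or complete.
        return MoleculeFlags(circular=False, complete=False)
    # Whole molecules keep their topology; gaps rule out "complete".
    return MoleculeFlags(circular=is_circular_topology, complete=not has_gaps)
```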
Annotation
The Annotation tab allows a submitter to import Feature Table files that describe the features (CDSs, rRNAs, tRNAs, misc_features, etc.) on the sequences.
A submitter may also use the Submission->Import->Import 5 Column Feature Table and Submission->Import->Import GFF3 File menu items to import annotation files to be added to the sequences, or may use the items in the Submission->Features menu to add individual features to a sequence. More information about importing feature tables can be found in the Import Manual.
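For orientation, a 5-column feature table is a tab-delimited text file. The sketch below generates one under the commonly described layout: a ">Feature" header line with the sequence identifier, a line with start, stop, and feature key, and qualifier lines indented by three empty columns. Treat this layout as an assumption and check it against the Import Manual. The function name and the example identifiers are hypothetical.

```python
def feature_table(seq_id: str, features: list) -> str:
    """Build 5-column feature table text for one sequence.

    `features` is a list of (start, stop, key, qualifiers) tuples,
    where `qualifiers` maps qualifier names to values.
    """
    lines = [f">Feature {seq_id}"]
    for start, stop, key, qualifiers in features:
        lines.append(f"{start}\t{stop}\t{key}")
        for name, value in qualifiers.items():
            # Qualifier name and value go in columns 4 and 5.
            lines.append(f"\t\t\t{name}\t{value}")
    return "\n".join(lines) + "\n"


tbl = feature_table("contig1", [
    (1, 1050, "CDS", {"product": "hypothetical protein"}),
])
```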
The Feature Table shown on this page can also be displayed using the Submission->Reports->Show Feature Table menu item. Existing features can be edited by clicking on the pen icon in the left margin next to the feature in the flatfile display of the sequence record in the Text View.
Reference
The Reference tab has two subpages, Sequence authors and Publication.
Reference - Sequence Authors
The Sequence authors page collects the names of the researchers associated with the sequencing and related analysis of the data.
Reference - Publication
The Publication page collects information about the paper, book chapter, thesis, etc., associated with this submission, which can be Unpublished, In-press, or Published.
Publications can also be added to the sequence data using the Submission->Add Publication menu item.
Validation
The Validation tab has two pages: Validate and Submitter Report. These reports are not run automatically. To view new results after finishing or changing a record, a submitter must click “Validate record” on the Validate page or “Refresh Submitter Report” on the Submitter Report page.
Validation - Validate
The Validator can also be launched with the Submission->Reports->Validate menu item. More information about the Validator can be found in the Validator Manual.
Validation - Submitter Report
The Submitter Report can also be launched with the Submission->Reports->Show Submitter Report menu item. More information about the Submitter Report can be found in the Submitter Report Manual.
To complete a submission, click the “Finish” button on the Submitter Report page. If any information is missing from the submission, a Missing Fields report pop-up will list the missing items.
If no items are missing, a submitter is prompted to save the finished submission as an .asn file, which can then be further edited in Genome Workbench or submitted to GenBank.
For more information, please see the full documentation for the NCBI Genome Workbench Editing Package.
Current Version is 3.8.2 (released December 12, 2022)