Support for Genome Workbench will end on March 31 2024. You may still use the application, but supporting documentation will not be available after this date. Read more.
Query Syntax in Genome Workbench and Tree Viewer
Queries in Genome Workbench and Tree Viewer are designed to work with structured data that is organized into a set of records with named fields. Two types of queries are supported, character string based search and query-based search. String based search searches all the fields for a single character string value. Query-based search is more like SQL and uses true/false (Boolean) expressions as the basis for selection.
Character String Search:
To perform a string-based search enter a single word with no blanks, or separated words in quotes, e.g.:
Listeria “Listeria monocytogenes”
The above queries work similarly to how a search in Google works – it searches all the fields in all the records of the data source and returns any entries where the above character string(s) appear. Note that, unlike Google, only one character string may be provided. The following queries are not valid:
Listeria blood “GCA_000810445.1” “Bodily Fluid”
Queries:
To search based on multiple criteria an SQL-like syntax is also supported. Queries are made up of one or more connected true/false (Boolean) expressions. These expressions are built up from comparison operators that compare fields in the data source to other fields or values provided in the query. The below comparison operators will work with either character, numeric or Boolean (true/false) values:
- = (equals)
- < (less than)
- > (greater than)
- <= (less than or equal to)
- >= (greater than or equal to)
- Like (equals where ‘?’ matches any single character and ‘*’ matches zero or more characters)
- Between (check if a value is between two other values)
- In (check if a value is equal to any in a list of values)
Any comparison in the query returns a Boolean value. A query can combine comparisons into more complex expressions using the following Boolean operators, which only work with Boolean values:
- AND (True only if both expressions are true)
- OR (True if either or both expressions are true)
- XOR (True if one expression is true and the other one is false)
- NOT (True if the associated expression is false)
Note that the keywords Like, Between, In, AND, XOR and NOT are not case sensitive – they can be lower case, upper case, or a mixture of both. Also expressions can be grouped together using parentheses to create different logical expressions, e.g.:
A and (B or C) True if A is true and B or C is true (A and B) or C True if both A and B are true or C is true
Example Queries:
For our example we will assume we have a phylogenetic tree with six leaf nodes and each of these nodes has properties (fields) that match one of the rows in the following table. The queries below show how to use the query language to select specific nodes.
row | dist | asm_acc | collection_date | geo_loc_name | isolation_source | scientific_name |
---|---|---|---|---|---|---|
1 | 0.0946 | GCA_000833875.1 | 2009-05-27 | Italy: Rimini | blood | Haemophilus influenzae |
2 | 0.1008 | GCA_000633915.1 | 2009-10-02 | Indian Ocean | deep sea sediment | Pseudomonas monteilii |
3 | 0.1396 | GCA_000409525.1 | 2009-08-28 | India | wetland sediment | Bacillus sp. Aph1 |
4 | 0.0007 | GCA_000810445.1 | 2009-06-08 | USA | Bodily fluid | Acinetobacter baumannii |
5 | 0.0419 | GCA_000729665.1 | 2009-06-12 | USA | Food Associated Environment | Listeria monocytogenes |
6 | 0.03 | GCA_000823885.1 | 2009-06-19 | United Kingdom: England | N/A | Legionella pneumophila |
Example 2: All entries that were taken in part of the UK: geo_loc_name like United Kingdom:* Result: (row 6)
Example 3: All entries that were collected from the USA or India geo_loc_name in (India, USA) Result: (rows 3,4,5)
Example 4: All entries for samples taken from sediment: isolation_source LIKE *sediment* Result: (rows 2,3)
Example 5: All entries whose distance to their parent is between 0.09 and 0.11 dist Between 0.09 and 0.11 Result: (rows 1,2)
Example 6: All entries with samples not from the USA and whose isolation source was not a sediment not geo_loc_name = USA and not isolation_source LIKE *sediment* Result: (rows 1,6)
Example 7: All entries that were collected in June 2009 or are of species Bacillus collection_date like 2009-06* OR scientific_name like Bacillus* Result: (rows 3,4,5,6)
Example 8: All entries with samples from the US or of species Acinetobacter baumannii but not both geo_loc_name = USA XOR scientific_name = “Acinetobacter baumannii” Result: (row 1)
Example 9: All entries for Listeria or Bacillus that were collected in June 2009 (scientific_name like Listeria* OR scientific_name like Bacillus*) AND collection_date like 2009-06* Result: (row 5)
Current Version is 3.8.2 (released December 12, 2022)
General
Help
Tutorials
- Basic Operation
- Using Active Objects Inspector
- Configure tracks and track display settings
- Working with Non-Public Data
- Viewing Multiple Alignments and Trees
- Broadcasting
- Genes and Variation
- Generating and Viewing Sequence Overlap Alignment
- Working with BAM Files
- Loading Tabular Data
- Working with VCF Files
- Sequence View Markers
- Opening Projects in Genome Workbench
- Publication quality graphics (PDF/SVG image export)
- Editing in Genome Workbench
- Create Protein Alignments using ProSplign
- GFF-CIGAR export for alignments
- Exporting Tree Nodes to CSV
- Generic Table View
- Running BLAST search against custom BLAST databases
- Using Phylogenetic Tree
- Coloring methods in Multiple Alignment View
- Displaying translation discrepancies
- Searching in Genome Workbench
- Graphical View Navigation and Manipulation
- Using the Text View to Review and Edit a Submission
- BAM haplotype filtering
- Displaying new non-NCBI molecules with annotations
- Creating phylogenetic tree from precalculated multiple alignment
- Creating phylogenetic tree starting from search
- Video Tutorials
General use Manuals
- Tree Viewer Formatting
- Tree Viewer Broadcasting
- Genome Workbench Macro
- Query Syntax in Genome Workbench and Tree Viewer
- Multiple Sequence Aligners
- Running Genome Workbench over X Window System
NCBI GenBank Submissions Manuals
- Table of Contents
- Introduction
- Genome Submission Wizard
- Save Submission File
- Reports
- Import
- Sequences
- Add Features
- Add Publication
- Comments
- Editing Tools