Clone Finder provides a clone-centric interface for identifying clones aligned to any assembly; searches can be filtered to find clones based on population/strain and library type. Clone Finder is available for any organism marked with the Clone Finder icon on the Map Viewer home page.
In order to use Clone Finder, you must first specify a region of interest on a particular assembly. You may search Region by Position, using a genomic position (in base-pair coordinates), or Region by Feature, specifying the names of genes, clones, SNPs, markers or transcripts. Context-specific help, in the form of movable pop-up text boxes, can be accessed for each section by clicking the icon on the right of each title bar. Holding down the left mouse button anywhere in the box lets you move it around the screen, and left-clicking the icon on the top right of the box closes it.
Specify Region by Position:
or
Specify Region by Feature:
If the region is specified by position, only a single region will be returned, listing assembly, chromosome, chromosomal coordinates and length. However, specifying a region by feature may return more than one region (clicking the icon next to the chromosome number displays an information box listing the number of hits by feature name and type in the region). In this case, the region of interest must be selected in order to find clones.
For each region there is a set of filters that can be used to refine the data displayed. The filters are separated into categories, and the category label is shown in the light grey title bar. All filters in a section can be selected by clicking the Check all link on the right of the title bar, and de-selected by clicking the Clear all link next to it. The entire section can be hidden (or re-opened) by clicking the inverted triangle next to the section title. In every section, items shown in grey text are not available for the selected region.
In some cases we may have clone placement information from more than one source. In these cases, we group the placements into datasets. Selection or de-selection of items in this section will automatically affect selections in the Library Selection section.
Genomic libraries are typically derived from either normal genomic DNA or a cancer source. Selection or de-selection of items in this section will automatically affect selections in the Library Selection section.
In many cases, population or strain data is associated with a given library. Selection or de-selection of items in this section will automatically affect selections in the Library Selection section.
Individual libraries are grouped into BAC, Fosmid and PAC vectors and can be selected or de-selected. Mouse-over the library name for further details including library name, library id, population or strain and DNA source.
Clicking on Find Clones will take you to the results page for the region selected.
The top left of the Clone Finder results page lists the assembly, chromosome, link to contig sequence, and chromosomal coordinates for the region you selected.
By default the Data summary is minimized, but it can be expanded by clicking the icon on the right of the Data Summary title bar. This view displays the list of libraries selected from the search page and includes the following information:
These columns can be selected or de-selected by hovering over any column header, clicking the icon, and then hovering over the Columns option, which reveals a list of tick-box column headings to choose from. The resulting columns can be rearranged by dragging and dropping their column headers.
Below the Data summary is a graphical display of the region of interest with the chromosomal coordinates along the top and Map Viewer tracks for the following features:
Mouse-over the feature image to display the feature name, and left-click on the feature image to open a pop-up text box containing feature details, e.g.:
These pop-up boxes can be moved around the screen by clicking and holding their title bar, and minimized or closed by clicking the corresponding icon on the top right of the pop-up.
The Download: Image button, on the right of the graphical display title bar, downloads a png file for the displayed image.
The Download: Excel button, on the right of the graphical display title bar, downloads an Excel file for the displayed clone data.
Below the Map Viewer tracks, the graphical data for each library is displayed. By default, concordant clones are drawn individually if they occupy less than 400 pixels and as a histogram otherwise; discordant clones are not displayed. For each library, the concordant and discordant displays can be changed using the drop-down menu options on the left of the title bar: 'Off' hides the track, 'Features' displays each clone and 'Histogram' displays the histogram. For concordant clones, dark blue represents the clone end alignments and a light blue dotted line connects the two ends of each clone. For discordant clones, dark red represents the clone end alignments and a light red dotted line connects the two ends of each clone.
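The default display rule described above can be sketched as a small function. This is only an illustration: the function name, the argument names and the way the pixel requirement is measured are assumptions, not part of Clone Finder itself.

```python
PIXEL_LIMIT = 400  # threshold quoted in the text above


def default_track_mode(clone_class, pixels_needed):
    """Return the default display mode for one library track.

    clone_class   -- "concordant" or "discordant"
    pixels_needed -- space the individual clones would occupy
                     (how this is measured is an assumption)
    """
    if clone_class == "discordant":
        # Discordant clones are hidden by default.
        return "Off"
    if clone_class != "concordant":
        raise ValueError("unknown clone class: %r" % (clone_class,))
    # Concordant clones: individual features if they fit, else a histogram.
    return "Features" if pixels_needed < PIXEL_LIMIT else "Histogram"
```

The returned strings match the drop-down options, so a user can always override the default by choosing another option for a track.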
Mouse-over the clone image to display the clone name and left click on the clone image to return a pop-up text box containing clone details:
These pop-up boxes can be moved around the screen by clicking and holding their title bar, and minimized or closed by clicking the corresponding icon on the top right of the pop-up.
By default the Library tables are not displayed. Click the icon on the left of the Clone title bar to display the library-specific data table containing the following information:
Mouse-over any column header and click the icon to reveal sorting options (if available) and, under Columns, a list of tick-box column headings to select for display. The columns can be rearranged by dragging and dropping their column headers.
Clicking on the + sign on the left of any clone listed in the Library table expands the view to provide details as seen in the pop-up boxes for the clone in the Graphical display:
Only the first 15 rows of each table are displayed. Click the forward and back arrows on the bottom left of each library table to browse through the pages, or add or subtract 5 rows by clicking +5 or -5 respectively. Each Library table can be minimized by clicking the icon on the right of the library title bar.
The Download: Excel button, also on the right of the Library table title bar, allows you to download an Excel file for each library.
The Excel files contain three rows for each clone listed in the Library table:
Sequences, quality scores and end information (such as trace name, trace strand and template name) for clone end sequences are retrieved from the dbGSS or Trace Archive database servers. After quality clipping, vector clipping and WindowMasker masking (Morgulis, A. et al.), the cleaned sequences are stored in FASTA files for further analysis.
Cleaned end sequences are aligned to contigs using an in-house BLAST-guided global alignment tool. First, megaBLAST (Zhang, Z. et al.) is used to align the cleaned, masked clone ends to the contigs with “-F T -W 28 -r 1 -q -3 -Z 200 -e 1e-10 -U T” as default parameters. High-identity alignments that are contained or dovetailed are accepted as valid alignments. High-identity alignments that are neither contained nor dovetailed are grouped by their subject_id (i.e. the contig’s gi) into a Seq_align_set for global (banded) alignment. Only Seq_align_set(s) in which at least 20% of the input sequence is covered by a contig are tested in banded alignment. The Seq_align(s) with the highest cov_pct are reported only if no alignment from banded alignment has a higher cov_pct.
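The grouping and 20% coverage filter can be illustrated with a minimal sketch. It assumes each BLAST hit is reduced to a (subject_id, query_start, query_end) tuple with half-open query coordinates, and that coverage is the merged length of those query intervals; the function names and this representation are hypothetical and not taken from the in-house tool.

```python
from collections import defaultdict

MIN_SET_COVERAGE_PCT = 20.0  # threshold quoted in the text above


def merged_length(intervals):
    """Number of query bases covered by half-open (start, end) intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous run and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def sets_for_banded_alignment(hits, query_length):
    """Group hits by subject_id; keep groups covering >= 20% of the query.

    Returns a dict mapping subject_id to its coverage percentage.
    """
    by_subject = defaultdict(list)
    for subject_id, q_start, q_end in hits:
        by_subject[subject_id].append((q_start, q_end))
    kept = {}
    for subject_id, intervals in by_subject.items():
        cov_pct = 100.0 * merged_length(intervals) / query_length
        if cov_pct >= MIN_SET_COVERAGE_PCT:
            kept[subject_id] = cov_pct
    return kept
```

For example, with a 500-base clone end, two overlapping hits to one contig spanning query positions 0 to 200 give 40% coverage and pass the filter, while a single 50-base hit to another contig gives 10% and is dropped.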
The best local alignment extracted from the global alignment is reported if it is high identity, is half-dovetailed or close to either contig end, and has higher coverage than the best-covered BLAST-generated alignment. Otherwise, the BLAST-generated alignment with the highest cov_pct is reported. Only alignments with greater than 80% coverage are used for clone placement.
The mean and standard deviation of clone size are calculated from the clones that meet the following requirements:
A Concordant clone is one that meets the following requirements:
A clone is Tie;concordant if it can be placed as concordant at multiple loci.
A Discordant clone can be an insertion, a deletion or an inversion:
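The choice between the two candidate alignments can be sketched as follows. The Alignment class, its field names and the single qualifies flag (standing in for the identity and dovetail checks, whose exact thresholds are not given here) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

PLACEMENT_MIN_COV_PCT = 80.0  # threshold quoted in the text above


@dataclass
class Alignment:
    source: str             # "blast" or "banded" (illustrative labels)
    cov_pct: float          # percentage of the clone end that is aligned
    qualifies: bool = True  # placeholder for the identity/dovetail checks


def choose_reported(blast_alns: Sequence[Alignment],
                    banded_best: Optional[Alignment]) -> Optional[Alignment]:
    """Pick the alignment to report for one clone end."""
    best_blast = max(blast_alns, key=lambda a: a.cov_pct, default=None)
    if (banded_best is not None and banded_best.qualifies
            and (best_blast is None
                 or banded_best.cov_pct > best_blast.cov_pct)):
        return banded_best
    # Otherwise fall back to the best-covered BLAST alignment (may be None).
    return best_blast


def usable_for_placement(aln: Optional[Alignment]) -> bool:
    """Only alignments with more than 80% coverage are used for placement."""
    return aln is not None and aln.cov_pct > PLACEMENT_MIN_COV_PCT
```

Keeping the reporting decision separate from the placement threshold mirrors the text: an alignment can be reported for a clone end and still be excluded from clone placement.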
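As a rough illustration of the tie rule, the sketch below labels a clone from the list of loci at which it could be placed as concordant. How each locus is tested for concordance is not shown, and the function name and input representation are hypothetical.

```python
def concordance_label(concordant_loci):
    """Label a clone from the loci where a concordant placement is possible.

    concordant_loci -- sequence of candidate loci (any representation);
                       the concordance test itself is assumed to be done.
    """
    count = len(concordant_loci)
    if count == 0:
        return None  # no concordant placement at all
    if count == 1:
        return "Concordant"
    return "Tie;concordant"  # concordant at more than one locus
```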