Sample GSM2747926
Status Public on Apr 06, 2018
Title cis input 1
Sample type SRA
 
Source name deep mutational scanning
Organism synthetic construct
Characteristics molecule: PCR amplicon
Growth protocol Yeast was grown in synthetic complete (SC) media -HIS at 30°C (input samples) and in SC -HIS 1 M NaCl at 37°C (output samples). Cells were harvested and centrifuged, and pellets were stored at -20°C for DNA extraction.
Extracted molecule other
Extraction protocol The sequencing libraries were constructed by two consecutive PCR reactions using a method adapted from Levy et al. (ref. 35). The first PCR was designed to amplify the region of interest, i.e. from directly upstream of the first mutated codon in FOS to directly upstream of the first mutated codon in JUN (the two genes are in head-to-tail orientation). The first PCR also added unique molecular barcodes (UMIs) and the first half of the Illumina adapter sequences. Using a small number of cycles in the first PCR limited the number of differently barcoded molecules that derive from the same template molecule. The second PCR then added the remainder of the Illumina adapter sequences. Plasmid concentrations in the total DNA extractions were first quantified by qPCR using primer pair OGD241-OGD242, which binds in the ori region of the plasmid. For each of the six samples of the cis and trans libraries, 4 and 48 PCRs, respectively, were performed using Q5 Hot Start High-Fidelity DNA Polymerase (New England Biolabs) according to the manufacturer’s protocol in 50 μL reactions with 4.2 × 10⁷ molecules of plasmid from the DNA extraction, 25 pmol of primers OGD237 and OGD238, a melting temperature of 66°C (previously determined by temperature gradient) and an extension time of 30 sec or 1 min, respectively, for 4 cycles. A total of 2 × 10⁹ molecules of plasmid were then used to prepare the sequencing libraries from each sample. Excess primers were removed by adding 2 μL of ExoSAP-IT (Affymetrix) and incubating for 20 min at 37°C, followed by inactivation for 15 min at 80°C. This step was necessary because these 60 nt primers are not efficiently removed by the subsequent column purification step. The 48 PCRs of each sample of the trans library were then pooled and purified using eight Qiagen PCR purification kit (Qiagen) columns per sample. According to the manufacturer’s protocol, one column can bind 10 μg of DNA, which corresponds to ~8 × 10⁸ genomes.
Eight columns were used to ensure that they were not saturated by the genomic DNA carried over from the DNA extraction. The DNA was eluted in 2 x 50 μL of EB buffer (provided by the manufacturer) and pooled for each sample. The eluted DNA was then split into 24 PCR reactions per sample, which were performed using Kapa HiFi HotStart DNA polymerase (Kapa Biosystems) according to the manufacturer’s protocol in 50 μL with 15 pmol of Illumina adapter primers. The reverse primers carried a different index for each of the six samples of the same library. For this PCR step, Kapa was chosen over Q5 because Kapa was less efficient in the first PCR reaction (higher optimal melting temperature) and would thus lead to less re-barcoding of amplicons with new UMIs present on primers carried over from the first PCR reaction. Each sample was loaded on an agarose gel to check for correct amplification. A strong non-specific band of lower size was observed, which seemed to gradually disappear as the number of cycles in the first PCR was increased. However, increasing the number of cycles would increase the probability of producing amplicons with different UMIs that derived from the same original template molecule. The number of cycles in the first PCR was thus kept at four, and the band of correct size was extracted by gel purification from each sample. To this end, the 24 PCRs of each sample were pooled, concentrated on four Qiagen PCR purification kit (Qiagen) columns per sample and each eluted with 2 x 50 μL of EB buffer (provided by the manufacturer). The bands of correct size were then purified on a 2% agarose gel starting from 100 μL of each sample using 10 μL of QIAEX II beads (Qiagen) according to the manufacturer’s protocol and eluted in 20 μL of EB buffer. DNA concentration was determined by PicoGreen in triplicate and the six samples were pooled at an equimolar ratio.
The pooled sample was sequenced in a single lane of an Illumina HiSeq 2500 with 125 bp paired-end reads at the EMBL Genomics Core Facilities in Heidelberg, Germany.
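The molecule counts quoted in the extraction protocol can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the yeast genome size (about 12.1 Mb) and the average molar mass of a base pair (about 650 g/mol) are assumptions that are not stated in this record, and all names are hypothetical.

```python
# Back-of-the-envelope check of the template and column-capacity numbers.
# Assumed values (NOT from this record): haploid S. cerevisiae genome size
# and the average molar mass of one DNA base pair.
AVOGADRO = 6.022e23          # molecules per mole
BP_MOLAR_MASS = 650.0        # g/mol per base pair (assumption)
YEAST_GENOME_BP = 12.1e6     # base pairs (assumption)

# Values taken from the protocol text.
TEMPLATES_PER_PCR = 4.2e7    # plasmid molecules per 50 uL reaction
PCRS_TRANS = 48              # first-round PCRs per trans sample
PCRS_CIS = 4                 # first-round PCRs per cis sample
COLUMN_CAPACITY_G = 10e-6    # 10 ug of DNA per purification column


def total_templates(n_pcrs, per_pcr=TEMPLATES_PER_PCR):
    """Total plasmid molecules entering the first PCR for one sample."""
    return n_pcrs * per_pcr


def genomes_per_column(capacity_g=COLUMN_CAPACITY_G):
    """Number of yeast genomes whose combined mass equals one column's capacity."""
    genome_mass_g = YEAST_GENOME_BP * BP_MOLAR_MASS / AVOGADRO
    return capacity_g / genome_mass_g


if __name__ == "__main__":
    print(f"trans templates per sample: {total_templates(PCRS_TRANS):.2e}")
    print(f"cis templates per sample:   {total_templates(PCRS_CIS):.2e}")
    print(f"genomes per column:         {genomes_per_column():.1e}")
```

Under these assumptions, 48 reactions supply about 2.0 × 10⁹ template molecules per trans sample, and one column's 10 μg capacity corresponds to roughly 7.7 × 10⁸ genomes, in line with the approximate figure quoted in the protocol.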
 
Library strategy OTHER
Library source genomic
Library selection other
Instrument model Illumina HiSeq 2500
 
Description from plasmid library
TableS4.xslx
Data processing trans library filtering (samples 1-6): Paired reads for which one of the two variable regions had an average Phred score below or equal to 20 were discarded. Additionally, paired reads were filtered out if they had i) one or more non-resolved bases (Ns) in the variable or UMI regions, ii) more than one mutated codon in each gene’s variable region or iii) a mutated codon ending in an A or T (since these were not encoded by the NNS mutagenic primers). (custom perl script)
cis library (samples 7-12): Paired-end reads that overlapped over the full length of the variable region were assembled using PEAR version 0.9.6 (ref. 38) with a p-value threshold for correct assembly of 0.05, maximum and minimum fragment lengths both set to 150 nt, and a minimum overlap size of 100 nt. These parameters force the assembly of sequences of the unique expected length. An average of 30% and 20% of the paired-end reads from the three input and output libraries, respectively, were not assembled and were filtered out. These included reads containing adapter sequences or indels (resulting from the <100% coupling efficiency inherent to DNA synthesis). Additionally, assembled reads with Ns in any part of the sequence were filtered out.
variant calling (custom perl script)
UMIs counting (custom perl script)
Supplementary_files_format_and_content: Excel file with read counts for each variant in each sample
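The trans library read filters and the UMI counting step listed under Data processing can be sketched as follows. This is not the custom perl script referenced in the record, which is not shown here. The function names, the input layout (one sequence and one list of Phred scores per variable region) and the wild-type sequences are hypothetical, and counting distinct UMIs per variant is an assumption about what "UMIs counting" means.

```python
from collections import defaultdict


def mean_phred(quals):
    """Average Phred score over one variable region."""
    return sum(quals) / len(quals)


def mutated_codons(seq, wild_type):
    """Codons in seq that differ from the wild type (equal lengths assumed)."""
    return [seq[i:i + 3] for i in range(0, len(wild_type), 3)
            if seq[i:i + 3] != wild_type[i:i + 3]]


def passes_filters(regions, umi, wild_types, max_bad_q=20):
    """Apply the filters described for the trans library.

    regions:    list of (sequence, phred_scores), one per gene's variable region
    umi:        UMI sequence of the read pair
    wild_types: wild-type sequence of each variable region (hypothetical input)
    """
    if "N" in umi:                          # i) unresolved base in the UMI
        return False
    for (seq, quals), wt in zip(regions, wild_types):
        if mean_phred(quals) <= max_bad_q:  # average Phred below or equal to 20
            return False
        if "N" in seq:                      # i) unresolved base in the region
            return False
        muts = mutated_codons(seq, wt)
        if len(muts) > 1:                   # ii) more than one mutated codon
            return False
        if any(codon[-1] in "AT" for codon in muts):
            return False                    # iii) not an NNS codon (ends in A/T)
    return True


def count_umis(reads):
    """Count distinct UMIs per variant from (variant, umi) pairs (assumed scheme)."""
    umis = defaultdict(set)
    for variant, umi in reads:
        umis[variant].add(umi)
    return {variant: len(seen) for variant, seen in umis.items()}
```

Counting distinct UMIs rather than raw reads means that PCR duplicates of the same barcoded molecule contribute only once to a variant's count, which is the usual reason for adding UMIs in the first PCR.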
 
Submission date Aug 21, 2017
Last update date May 15, 2019
Contact name Guillaume Diss
E-mail(s) guillaume.diss@gmail.com
Organization name CRG
Street address C/ Dr. Aiguader, 88
City Barcelona
ZIP/Postal code 08003
Country Spain
 
Platform ID GPL19604
Series (1)
GSE102901 The genetic landscape of a physical interaction
Relations
BioSample SAMN07525988
SRA SRX3110834

Supplementary data files not provided
Raw data are available in SRA
Processed data are available on Series record
