SRX24414623: GSM8244562: sparse_DTS_cores_output_2; Saccharomyces cerevisiae; OTHER
1 ILLUMINA (NextSeq 2000) run: 35.6M spots, 10.7G bases, 3.1Gb downloads

External Id: GSM8244562_r1
Submitted by: Ben Lehner, Systems and synthetic biology, CRG
Study: Genetics, energetics and allostery during a billion years of hydrophobic protein core evolution
Abstract:
Protein folding is driven by the burial of hydrophobic amino acids in a tightly-packed core that excludes water. The genetics, biophysics and evolution of hydrophobic cores are not well understood, in part because of a lack of systematic experimental data on sequence combinations that do - and do not - constitute stable and functional cores. Here we randomize protein hydrophobic cores and evaluate their stability and function at scale. The data show that vast numbers of amino acid combinations can constitute stable protein cores but that these alternative cores frequently disrupt protein function because of allosteric effects. These strong allosteric effects are not due to complicated, highly epistatic fitness landscapes but rather, to the pervasive nature of allostery, with many individually small energy changes combining to disrupt function. Indeed both protein stability and ligand binding can be accurately predicted over very large evolutionary distances using additive energy models with a small contribution from pairwise energetic couplings. As a result, energy models trained on one protein can accurately predict core stability across hundreds of millions of years of protein evolution, with only rare energetic couplings that we experimentally identify limiting the transplantation of cores between highly diverged proteins. Our results reveal the simple energetic architecture of protein hydrophobic cores and suggest that allostery is a major constraint on sequence evolution.
Overall design: We built combinatorial libraries in the hydrophobic cores of three small protein domains (FYN-SH3, CI-2A and CspA) using a reduced alphabet consisting of the amino acids F, L, M, V, I encoded by the DTS degenerate codon.
By bottlenecking and pooling the libraries, in the sparse_DTS_core_mutagenesis experiment we sparsely measured the intracellular abundance of protein variants in yeast cells using abundancePCA, a protein complementation assay that couples cell growth rate with query protein intracellular abundance under selection by methotrexate. For the SH3 domain of the human FYN kinase, we selected a few query core amino acid combinations that are severely deleterious in abundance fitness and designed a suppressor "permissivity" library by introducing non-core mutations associated with SH3 domains naturally carrying such query core combinations that are deleterious in FYN (FYN-SH3_core_permissivity experiment). Also for FYN-SH3, we assessed the impact of core reconfiguration in function by measuring the binding to its short linear motif ligand PRD1super using bindingPCA, a protein complementation assay that couples cell growth rate with query variant intracellular binding to an interacting partner under selection by methotrexate (FYN-SH3_core_DTS_binding experiment).
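As an illustration of the reduced alphabet in the overall design, the short Python sketch below expands the DTS degenerate codon using the standard IUPAC codes (D = A/G/T, S = C/G) and translates the result with the standard genetic code. The function and variable names are illustrative and are not taken from the study.

```python
from itertools import product

# IUPAC nucleotide codes needed for the DTS codon (D = A/G/T, S = C/G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "D": "AGT", "S": "CG"}

# Subset of the standard genetic code covering every codon DTS can produce.
CODON_TABLE = {
    "ATC": "I", "ATG": "M",
    "GTC": "V", "GTG": "V",
    "TTC": "F", "TTG": "L",
}

def expand(degenerate_codon):
    """Return all concrete codons encoded by a degenerate codon."""
    choices = (IUPAC[base] for base in degenerate_codon)
    return ["".join(bases) for bases in product(*choices)]

codons = expand("DTS")
amino_acids = sorted({CODON_TABLE[c] for c in codons})

print(codons)       # ['ATC', 'ATG', 'GTC', 'GTG', 'TTC', 'TTG']
print(amino_acids)  # ['F', 'I', 'L', 'M', 'V']
```

DTS yields six codons for five amino acids, so valine (GTC and GTG) is encoded twice per randomized position at the nucleotide level.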
Sample: sparse_DTS_cores_output_2
SAMN41143313 • SRS21169933
Library:
Name: GSM8244562
Instrument: NextSeq 2000
Strategy: OTHER
Source: GENOMIC
Selection: other
Layout: PAIRED
Construction protocol: Input and output culture cell pellets were resuspended in DNA extraction buffer (10 mM Tris-HCl, 100 mM NaCl, 2% Triton-X, 1% SDS, pH 8). Volumes were scaled depending on the volume of cell pellets to process. For pellets from 100 mL cultures harvested at OD600 1.6, 1 mL of DNA extraction buffer was used. Cells were lysed by cyclically freezing in liquid nitrogen followed by incubation in a 62ºC water bath twice. Total DNA was extracted by adding 1 mL of a 25:24:1 phenol:chloroform:isoamyl alcohol mixture (Merck, Darmstadt, Germany) along with 1 g of acid washed glass beads (Sigma-Aldrich, Saint Louis, MO, US) and vortexing for 10 min. The mixture was centrifuged at room temperature for 30 min at 3,300 g and the aqueous phase (upper layer) was recovered in a separate tube. 1 mL of the phenol:chloroform:isoamyl alcohol was added to it and the mixture was vortexed for 2 min followed by 45 min centrifugation at 3,300 g. The aqueous phase was again transferred to a new tube and mixed with 0.1 volumes of 3 M sodium acetate and 2.2 volumes of pure ethanol previously cooled down at -80ºC. The mixture was kept at -20 ºC for 30 min and spun down at 3,300 g for 30 min at 4ºC. The DNA pellets were dried overnight in a fume extraction hood. Plasmid DNA from total DNA extracts was quantified by qPCR using the oGJJ152-oGJJ153 oligo pair that anneals at the origin of replication of all pGJJ vector series. The qPCR reaction was performed in a LightCycler 400 instrument (Roche, Basel, Switzerland) using the SYBR green qPCR 2X master mix from Thermo Fisher (Waltham, MA, US). Quantification was achieved by measuring a standard curve on a serial dilution of a control sample of known concentration in the same qPCR run as the query samples, and subtracting blank measurements. Library preparation for sequencing in Illumina instruments followed a two-step PCR protocol. Frameshifts and a segment of the Illumina adapter overhangs were introduced in PCR1. 
For each sample, 8x50 μL PCR1 reactions were performed using 31.25 million plasmid molecules as a template as quantified by qPCR. PCR1 used the oGJJ595 and the oGJJ748 frameshifting oligo pools and consisted of 11 cycles. Next, all reactions corresponding to the same sample were pooled and treated with 16 μL ExoSAP for 30 min at 37ºC, then for 20 min at 80ºC. Samples were cleaned-up using MinElute columns, eluted in 30 μL of pre-warmed water (55ºC) and 1.5 μL were used as template for each PCR2 reaction. 16x50 μL PCR2 reactions were performed per sample to introduce the rest of the Illumina adapters and the dual index barcodes. PCR2 reactions corresponding to the same sample were pooled, concentrated using a MinElute column, and run in a 2% agarose gel. Bands of the correct sizes were excised from the gel and sequencing-ready DNA was recovered by using a centrifugal filter unit followed by MinElute clean-up. Samples were quality controlled by the CRG Genomics core facility using a TapeStation instrument (Agilent, Santa Clara, CA, US), quantified by qPCR and sequenced in either a NextSeq500 (FYN SH3 permissivity library) or NextSeq2000 (subsampled core DTS library pool and FYN SH3 binding library) instrument (Illumina, San Diego, CA, US). Combinatorial core libraries using a reduced alphabet of hydrophobic amino acids (F, L, I, M, V) were built by multiple segment Gibson assembly of oligonucleotides bearing the DTS degenerate codon at core residue-encoding positions (oligos used in this study were all obtained from IDT, Coralville, IA, US). Each of the genes encoding the FYN SH3 domain, CI-2A and CspA was codon optimized for expression in Saccharomyces cerevisiae and divided into two segments smaller than 200 bp each, which was the size limit for commercial synthetic degenerate oligo production at the time of performing this study.
Both the 5' and the 3' gene segments contained a homology region of at least 25 bp to the linearized pGJJ191 mutagenesis vector, plus an internal homology region of at least 25 bp. Double stranded Gibson assembly inserts were produced from single stranded commercial oligonucleotides via a single cycle PCR reaction using a 3' annealing reverse primer specific to each segment (see Table ). The pGJJ191 vector was linearized via a 25-cycle PCR reaction using primers oGJJ311 and oGJJ406, followed by 1h DpnI (NEB) treatment at 37ºC plus inactivation at 80ºC for 20 minutes (1 μL enzyme per 50 μL PCR reaction). Both inserts and vector PCR reactions were cleaned-up in MinElute columns (Qiagen, Hilden, Germany). Gibson assembly mixtures contained 300 ng of linearized pGJJ191 plus the two insert segments at a 5:1 molar ratio each, along with a Gibson enzyme mixture produced in house by the Protein Technologies core facility at the CRG. Conversely, the FYN SH3 permissivity library was purchased as an oligo pool from Twist Bioscience (San Francisco, CA, US). The oligo pool was amplified in a 13 cycle PCR reaction with primers oAE018 and oAE227, which was cleaned up using a MinElute column. The insert amplified oligo pool contained 25 bp homology regions at both ends and was mixed at a 5 to 1 molar ratio with 300 ng of linearized pGJJ191 and subsequently with the Gibson enzyme mixture. All Gibson assembly reactions were incubated at 50ºC for 8h, dialyzed against water using 0.025 μm filters (Millipore, Burlington, MA, US) for 1.5h, concentrated using a SpeedVac and transformed into E. coli C3020 cells (NEB, Ipswich, MA, US). The number of transformants was assessed by colony count in serial dilutions of transformation outgrowth aliquots in LB plates supplemented with 50 μg/mL spectinomycin, and 1 to 5 equivalent transformation reactions were pooled to reach at least 20X variant coverage per library as required. 
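The colony-count and coverage reasoning above can be sketched in Python as follows. All numbers here are hypothetical placeholders: the record does not give colony counts, plating volumes, or the number of randomized core positions, and counting variants as six DTS codons per position is an assumption rather than the authors' stated definition of library complexity.

```python
import math

def transformants_per_reaction(colonies, dilution_factor, plated_ul, outgrowth_ul):
    """Estimate total transformants from colonies counted on one dilution plate."""
    return colonies * dilution_factor * outgrowth_ul / plated_ul

def reactions_needed(n_variants, per_reaction, min_coverage=20):
    """Number of equivalent transformations to pool to reach the target coverage."""
    return math.ceil(min_coverage * n_variants / per_reaction)

# Hypothetical library: 7 randomized core positions, 6 DTS codons each.
n_core_positions = 7
n_variants = 6 ** n_core_positions

# Hypothetical plate: 30 colonies from 100 uL of a 1:10,000 dilution
# of a 1 mL transformation outgrowth.
per_reaction = transformants_per_reaction(
    colonies=30, dilution_factor=10_000, plated_ul=100, outgrowth_ul=1_000
)

pooled = reactions_needed(n_variants, per_reaction)
coverage = pooled * per_reaction / n_variants

print(n_variants, per_reaction, pooled, round(coverage, 1))
```

With these placeholder values, two pooled transformations are enough to exceed the 20X threshold, which falls within the 1 to 5 pooled reactions mentioned in the protocol.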
Plasmid DNA was extracted using the Qiagen Plasmid Plus Midi kit and the library inserts were obtained through double endonuclease digestion (HindIII-HF and NheI-HF, NEB) followed by agarose gel purification achieved by a combined use of centrifugal filter units (Millipore) and MinElute clean-up. Inserts were introduced in double-restricted, Quick CIP-treated (NEB) pGJJ162 (abundancePCA) or pGJJ159-PRD1super (bindingPCA) vectors by overnight thermal-cyclic ligation using T4 DNA ligase. The pGJJ159-PRD1super vector was previously obtained by introducing the S. cerevisiae codon-optimized PRD1super encoding gene from commercial oligo synthesis (IDT) into BamHI-SpeI double restriction-linearized pGJJ159 via Gibson assembly. Ligation mixtures were dialyzed against water, SpeedVac-concentrated and transformed into E. coli C3020 electrocompetent cells. Equivalent transformations were combined to reach at least 20X variant coverage, and selection-ready libraries in PCA vectors were Midi-prep extracted. For sparse sampling of combinatorial core libraries (FYN SH3, CI-2A and CspA in pGJJ162), the full-complexity libraries were transformed into E. coli C3020 electrocompetent cells and bottlenecked by serial dilution of the outgrowth medium into overnight selective medium (LB supplemented with 50 μg/mL ampicillin), with transformant numbers assessed in parallel through colony counts of plated aliquots, aiming at ~10,000 transformants per library. Bottlenecked libraries were extracted by mini-prep, qPCR quantified and mixed to equivalent molar ratios.
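The qPCR quantification described in the protocol (a standard curve from a serial dilution of a control of known concentration, with blank subtraction) can be sketched in Python as below. This is a minimal illustration, not the authors' analysis: the Ct values and copy numbers are invented, and treating blank subtraction as subtracting the copy number estimated for a no-template blank is an assumption.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template copies from a Ct value."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical 10-fold serial dilution of a control of known concentration.
standard_copies = [1e7, 1e6, 1e5, 1e4]
standard_ct = [12.0, 15.3, 18.6, 21.9]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
efficiency = 10 ** (-1 / slope) - 1  # amplification efficiency implied by the slope

# Hypothetical query sample and blank measured in the same run.
query = copies_from_ct(16.95, slope, intercept)
blank = copies_from_ct(33.0, slope, intercept)
query_corrected = max(query - blank, 0.0)

print(round(slope, 2), round(intercept, 2), round(efficiency, 2))
print(f"{query_corrected:.3g} copies")
```

Running the standards and the query samples in the same qPCR run, as the protocol specifies, means the fitted slope and intercept apply directly to the query Ct values.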
Runs: 1 run, 35.6M spots, 10.7G bases, 3.1Gb
Run | # of Spots | # of Bases | Size | Published
SRR28854321 | 35,561,111 | 10.7G | 3.1Gb | 2024-05-13
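As a quick consistency check on the run statistics, the Python sketch below estimates the average number of bases per spot and per mate. It assumes the rounded 10.7G total reported above, so the result is approximate; the record does not state read lengths, and the per-mate figure further assumes both mates of the PAIRED layout have equal length.

```python
spots = 35_561_111      # "# of Spots" for the run
total_bases = 10.7e9    # "# of Bases", rounded to 10.7G in the record

bases_per_spot = total_bases / spots
bases_per_mate = bases_per_spot / 2  # assumes two mates of equal length

print(f"{bases_per_spot:.1f} bases per spot, about {bases_per_mate:.0f} per mate")
```

The result, roughly 300 bases per spot, is what a paired-end run with mates of about 150 nt would give.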

ID: 32730514
