Status |
Public on May 10, 2024 |
Title |
mouse_activating_frame+2_unsorted |
Sample type |
SRA |
|
|
Source name |
mESC
|
Organism |
Mus musculus |
Characteristics |
cell line: mESC
genotype: wild-type cells with 7xTetO-pMYLPF-Puro-IRES-GFP reporter integrated into the expression-stable locus on Chr15, clonal cell line
treatment: frame+2_unsorted
|
Treatment protocol |
The ORFtag viral constructs were derived from the ecotropic Retro-EGT construct (ref. 13), which includes the sequence features necessary for the inverse-PCR protocol (detailed below and in Extended Data Fig. 1). Furthermore, the constructs feature a constitutively active PGK promoter that drives the expression of a NeoR resistance gene separated from a tag by the IRES sequence. The tag contained either TetR with an N-terminally located nuclear localization signal (Activator screen, Repressor screen; Addgene, this study) or the LambdaN domain (PTGR screen; Addgene, this study). Additionally, the tag includes a 2x GGGS-linker followed by the BC2-tag and 3xFLAG-tag. Finally, the ORFtag construct contains a consensus splice donor motif (GT) followed by a segment of the Hprt intron (chrX:53020400-53020556 +, mm10). To ensure tagging of genes in all three possible coding frames, three variants of the construct were used that contain either 0, 1 or 2 additional nucleotides upstream of the consensus splice motif (GT), resulting in the following sequences: AAG-CAG-GT (frame 1), AAG-G-CAG-GT (frame 2) or AAG-GC-CAG-GT (frame 3), where AAG represents the last codon of the 3xFLAG-tag.
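The effect of the 0, 1 or 2 extra nucleotides can be illustrated with a short Python sketch. It assumes that the tag is read in frame starting at the AAG codon and that the GT dinucleotide marks the first intronic bases; the function name `exonic_phase` is hypothetical and not part of the submitters' pipeline.

```python
# Sequences quoted in the treatment protocol: last 3xFLAG codon (AAG),
# optional extra nucleotides, CAG, then the splice donor motif GT.
SPLICE_DONOR = "GT"
VARIANTS = {
    "frame 1": "AAG" + "" + "CAG" + SPLICE_DONOR,
    "frame 2": "AAG" + "G" + "CAG" + SPLICE_DONOR,
    "frame 3": "AAG" + "GC" + "CAG" + SPLICE_DONOR,
}


def exonic_phase(seq: str) -> int:
    """Nucleotides left over after the last complete codon upstream of GT.

    Assumption: everything before the GT motif is exonic and the reading
    frame starts at the first base of the sequence.
    """
    if not seq.endswith(SPLICE_DONOR):
        raise ValueError("sequence must end with the splice donor motif")
    exonic = seq[: -len(SPLICE_DONOR)]
    return len(exonic) % 3


phases = {name: exonic_phase(seq) for name, seq in VARIANTS.items()}
for name, seq in VARIANTS.items():
    print(name, seq, "leftover nucleotides:", phases[name])
```

Under these assumptions the three variants leave 0, 1 and 2 nucleotides before the splice junction, so each one can continue the reading frame of a different class of downstream exon.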
|
Growth protocol |
All experiments presented here were carried out in diploid mouse embryonic stem cells (mESCs) that were derived from the originally haploid HMSc2 line, termed AN3-12 (ref. 13). The mESCs were cultivated without feeders in high-glucose DMEM (Sigma-Aldrich) supplemented with 13.5% fetal bovine serum (Sigma-Aldrich), 2 mM L-glutamine (Sigma-Aldrich), 1x Penicillin-Streptomycin (Sigma-Aldrich), 1x MEM non-essential amino acid solution (Gibco), 1 mM sodium pyruvate (Sigma-Aldrich), 50 mM β-mercaptoethanol (Merck) and in-house produced recombinant LIF. Virus packaging cell lines, Lenti-X 293T (Takara) and PlatinumE (Cell Biolabs), were grown according to the manufacturers’ instructions. All cell lines were cultured at 37°C and 5% CO2 and regularly tested for mycoplasma contamination.
|
Extracted molecule |
genomic DNA |
Extraction protocol |
Genomic locations of ORFtag integrations were mapped using a modified inverse-PCR followed by next-generation sequencing (iPCR-NGS) protocol (ref. 8). Genomic DNA was prepared by lysing cell pellets in lysis buffer (10 mM Tris-HCl pH 8.0, 5 mM EDTA, 100 mM NaCl, 1% SDS, 0.5 mg/ml proteinase K) at 55°C overnight. Following a 2-hour RNase A treatment (Qiagen, 100 mg/ml, 1:1,000 dilution) at 37°C, two extractions using phenol:chloroform:isoamyl alcohol and one extraction using chloroform:isoamyl alcohol were carried out. The samples then underwent two separate digestion reactions (with up to 4 µg of genomic DNA) using NlaIII and MseI enzymes (NEB) at 37°C overnight, followed by purification using a Monarch PCR & DNA Cleanup Kit (NEB). Ring-ligation was carried out using T4 DNA ligase (NEB) at 16°C overnight, followed by heat-inactivation (65°C, 15 min) and linearization using SbfI-HF (NEB) at 37°C for 2 h. The digests were then purified using a Monarch PCR & DNA Cleanup Kit (NEB) and amplified, first in a nested PCR reaction with KAPA HiFi HotStart ReadyMix (Roche) and a specific primer pair (TGCAGGACCGGACGTGACTGGAGTTC*A, TGCAGGACGATGAGCAGAGCCAGAACC*A) for 16 cycles. After cleanup with AMPure XP Reagent (Beckman Coulter, 1:1 ratio beads:PCR), iPCR amplification was carried out with KAPA HiFi HotStart ReadyMix (Roche) and a specific primer pair (AATGATACGGCGACCACCGAGATCTACACGAGCCAGAACCAGAAGGAACTTGA*C, CAAGCAGAAGACGGCATACGAGAT [custom-barcode] GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT) for 18 cycles. Afterwards, amplified libraries were size-selected for a range of 400-800 bp using SPRIselect beads (Beckman Coulter). NGS was performed on an Illumina NextSeq 550 or Illumina HiSeq 2500 sequencer according to the manufacturers’ protocols with a custom first-read primer (1:1 mix of GAGTGATTGACTACCCGTCAGCGGGGGTCTTTCA and TGAGTGATTGACTACCCACGACGGGGGTCTTTCA).
|
|
|
Library strategy |
OTHER |
Library source |
genomic |
Library selection |
other |
Instrument model |
Illumina NextSeq 500 |
|
|
Data processing |
library strategy: ORFtag
First, iPCR reads from sorted and background (non-selected, input) samples were trimmed using Trim Galore (v0.6.0) with default parameters to remove Illumina adapters. Then, trimmed reads were aligned to the mm10 version of the mouse genome using bowtie2 (v2.3.4.2; ref. 20) with default parameters (for paired-end sequenced samples, only first mate reads were considered), before removal of duplicated and low mapping quality reads (MAPQ ≤ 30) using samtools (v1.9; ref. 21). Mapped insertions were assigned to the closest downstream exon junction, with a maximum distance of 200 kb, based on GENCODE annotations of the mouse genome (vM25). Insertion counts were then aggregated per gene. Of note, only exons from protein-coding transcripts were considered, except for the first exon of each transcript, which does not contain a splice acceptor site. Consequently, intronless genes (for which none of the isoforms contain a spliced intron) were not considered. Finally, genes showing significantly more insertions in sorted samples compared to unsorted samples (using the same cassette) were identified using a one-tailed Fisher's exact test (alternative = "greater"). Of note, only genes with at least 3 unique insertions in sorted samples were considered. Obtained p-values were corrected for multiple testing using the FDR method, and genes showing an FDR < 0.001 and a log2 odds ratio ≥ 1 were classified as hits.
Assembly: mm10
Supplementary files format and content: counts_same_strand files contain the genomic coordinates of all the insertions that were found in the sample, and the ID of the assigned gene. In addition, the distance between the insertion and the exon that was used for assignment, as well as its position (number) inside the gene, are reported.
Supplementary files format and content: counts_rev_strand files are similar to the counts_same_strand files, but report the assignment of reversed insertions. They are used to compute strand bias.
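The assignment of an insertion to the closest downstream exon junction can be sketched as follows. This is an illustration only, not the submitters' code: the data layout (a sorted list of `(position, gene_id)` splice acceptor sites per chromosome and strand), the function name `assign_insertion` and the toy coordinates are all assumptions, and "downstream" is interpreted relative to the strand of the annotated gene.

```python
from bisect import bisect_left, bisect_right

MAX_DIST = 200_000  # 200 kb limit stated in the data processing description


def assign_insertion(pos, strand, acceptors):
    """Return (gene_id, distance) for the nearest downstream acceptor, or None.

    acceptors: list of (position, gene_id) tuples sorted by position, all on
    the same chromosome and strand as the insertion (hypothetical layout).
    The position list is rebuilt on every call, which is fine for a sketch
    but would be precomputed in practice.
    """
    positions = [p for p, _ in acceptors]
    if strand == "+":
        i = bisect_left(positions, pos)       # first site at or after pos
        if i == len(positions):
            return None
        dist = positions[i] - pos
    else:
        i = bisect_right(positions, pos) - 1  # last site at or before pos
        if i < 0:
            return None
        dist = pos - positions[i]
    return (acceptors[i][1], dist) if dist <= MAX_DIST else None


# Toy annotation with made-up coordinates and gene names.
acceptors = [(1_000, "GeneA"), (5_000, "GeneA"), (400_000, "GeneB")]
near = assign_insertion(1_200, "+", acceptors)
too_far = assign_insertion(6_000, "+", acceptors)
minus = assign_insertion(6_000, "-", acceptors)
print(near, too_far, minus)
```

In the toy data the second insertion is left unassigned because the next acceptor site lies more than 200 kb away, which mirrors the distance cut-off described above.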
Supplementary files format and content: The sorted_vs_unsorted.txt files contain, for each gene, the enrichment of unique insertions in sorted vs unsorted samples (log2OR) with the associated FDR (padj). In this study, all genes with an FDR<0.001 were considered as hits (see "hit" column).
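The enrichment statistics reported in these files can be approximated with the sketch below. It is a pure-Python illustration under stated assumptions, not the submitters' implementation: the "FDR method" is assumed to mean the Benjamini-Hochberg procedure, the 2x2 table is assumed to compare insertions in the gene against all other insertions in each sample, and the gene names and counts are invented.

```python
from math import comb, log2


def fisher_greater(a, b, c, d):
    """One-tailed Fisher's exact test (alternative 'greater') on [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, min(row1, col1) + 1))
    return tail / comb(n, col1)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted, running_min = [0.0] * m, 1.0
    for rank in range(m, 0, -1):              # walk from largest p to smallest
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def log2_odds_ratio(a, b, c, d):
    # Assumption: a zero denominator is reported as infinite enrichment.
    return float("inf") if b * c == 0 else log2((a * d) / (b * c))


# Invented counts: gene -> (unique insertions in sorted, in unsorted).
counts = {"GeneA": (40, 2), "GeneB": (5, 6), "GeneC": (2, 1)}
total_sorted, total_unsorted = 1000, 1000

# Only genes with at least 3 unique insertions in the sorted sample are tested.
tested = {g: (s, total_sorted - s, u, total_unsorted - u)
          for g, (s, u) in counts.items() if s >= 3}
genes = list(tested)
padj = benjamini_hochberg([fisher_greater(*tested[g]) for g in genes])
hits = [g for g, q in zip(genes, padj)
        if q < 0.001 and log2_odds_ratio(*tested[g]) >= 1]
print(hits)
```

With real insertion counts, `math.comb` on genome-scale totals becomes slow, so a statistics library would be the more practical choice; the sketch only shows how the one-tailed test, the multiple-testing correction and the hit thresholds fit together.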
|
|
|
Submission date |
May 09, 2024 |
Last update date |
May 10, 2024 |
Contact name |
Alexander Stark |
E-mail(s) |
stark@starklab.org
|
Organization name |
The Research Institute of Molecular Pathology (IMP)
|
Lab |
Stark Lab
|
Street address |
Campus-Vienna-Biocenter 1
|
City |
Vienna |
ZIP/Postal code |
1030 |
Country |
Austria |
|
|
Platform ID |
GPL19057 |
Series (2) |
GSE225972 |
Proteome-scale tagging and functional screening in mammalian cells by ORFtag |
GSE267110 |
Proteome-scale tagging and functional screening in mammalian cells by ORFtag [frame_specific_screen] |
|
Relations |
BioSample |
SAMN41284582 |
SRA |
SRX24508984 |
Supplementary file | Size | Download | File type/resource |
GSM8260022_mouse_activating_frame+2_unsorted_counts_rev_strand.txt.gz | 717.9 Kb | (ftp)(http) | TXT |
GSM8260022_mouse_activating_frame+2_unsorted_counts_same_strand.txt.gz | 719.2 Kb | (ftp)(http) | TXT |
Raw data are available in SRA |