 |
 |
|
Status |
Public on Aug 10, 2023 |
Title |
PerturbSci-Kinetics pooled screen |
Sample type |
SRA |
|
|
Source name |
cell line
|
Organism |
Homo sapiens |
Characteristics |
cell line: HEK293-idCas9
|
Treatment protocol |
On day 7, before the end of the screen, 6ml of the original media from each plate was mixed with 6uL of 200mM 4sU dissolved in DMSO and returned to the plate for metabolic labeling of nascent RNA. After 2 hours of treatment, cells were harvested for single-cell PerturbSci-Kinetics profiling.
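As a quick check of the labeling conditions, the dilution described above works out to roughly 200 µM 4sU in the media. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of the protocol.

```python
def final_concentration_um(stock_mm, stock_ul, media_ml):
    """Final concentration in µM after adding a stock solution to media.

    Applies C1 * V1 = C2 * V2, with V2 being the combined volume.
    """
    stock_umol = stock_mm * stock_ul / 1000.0  # mM * µL = nmol; /1000 gives µmol
    total_volume_l = (media_ml * 1000.0 + stock_ul) / 1e6  # µL converted to L
    return stock_umol / total_volume_l  # µmol / L = µM


# 6 µL of 200 mM 4sU added to 6 mL of media
print(round(final_concentration_um(200, 6, 6), 1))
```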
|
Growth protocol |
HEK293-idCas9 cells were transduced with a sgRNA library at a low MOI and were selected with puromycin. Then dCas9-KRAB-MeCP2 expression was induced by adding Dox to the culture medium. Cells were cultured for an additional 7 days and were passaged every other day with 4000x coverage/sgRNA.
|
Extracted molecule |
polyA RNA |
Extraction protocol |
After trypsinization, cells in each 10cm dish were collected into a 15ml Falcon tube and kept on ice. Cells were spun down at 300xg for 5 minutes (4 °C) and washed once in 3ml ice-cold PBS. Cells were fixed with 5ml ice-cold 4% PFA in PBS for 15 minutes on ice. PFA was then quenched by adding 250ul 2.5M glycine, and cells were pelleted at 500xg for 5 minutes (4 °C). Fixed cells were washed once with 1ml PBSR (PBS, 0.1% SUPERase In, and 10mM dithiothreitol (DTT)), and were then resuspended, permeabilized, and further fixed in 1ml PBSR-triton-BS3 (PBS, 0.1% SUPERase In, 0.2% Triton X-100 (Sigma X100-500ML), 2mM bis(sulfosuccinimidyl)suberate (BS3), 10mM DTT) for 5 minutes. An additional 4ml of PBS-BS3 (PBS, 2mM BS3, 10mM DTT) was then added, and cells were incubated on ice for 15 minutes. Cells were pelleted at 500xg, 4 °C for 5 minutes and resuspended in 500ul nuclease-free water supplemented with 0.1% SUPERase In and 10mM DTT. Then 3ml of 0.05N HCl was added for further permeabilization. After 3 minutes of incubation on ice, 3.5ml Tris-HCl, pH 8.0, and 35ul of 10% Triton X-100 were added to each tube to neutralize the HCl. After spinning down at 4 °C, 500xg for 5 minutes, cells were finally resuspended in 400ul PSB-DTT (PBS, 1% SUPERase In, 1% Bovine Serum Albumin (BSA), 1mM DTT) at a concentration of ~2e6 cells/100ul, mixed with 10% DMSO, slow-frozen, and stored at -80 °C. Fixed cells were thawed, then mRNA and sgRNA within each individual cell were indexed by indexing reverse transcription, indexing ligation, and finally indexing PCR. sgRNA libraries were further separated and enriched by another round of PCR. A step-by-step experimental protocol of PerturbSci-Kinetics library preparation is included in the supplementary material of the publication.
|
|
|
Library strategy |
RNA-Seq |
Library source |
transcriptomic |
Library selection |
cDNA |
Instrument model |
NextSeq 1000 |
|
|
Description |
PerturbSci-Kinetics library
|
Data processing |
For PerturbSci-Kinetics transcriptome read processing and whole-transcriptome/nascent-transcriptome gene counting, the pipeline was developed based on EasySci and Sci-fate with minor modifications. After demultiplexing on index 7, Read1 sequences were matched against a constant sequence on the sgRNA capture primer to remove nonspecific priming, and the cell barcodes and UMI sequences in Read1 were added to the headers of the Read2 fastq files, which were retained for further processing. After potential polyA sequences and low-quality bases were trimmed from Read2 by Trim Galore, reads were aligned with STAR to a customized reference genome consisting of the complete hg38 reference genome and the dCas9-KRAB-MeCP2 sequence from Lenti-idCas9-KRAB-MECP2-T2A-mCherry-Neo. Unmapped reads and reads with mapping score < 30 were filtered out by samtools. Deduplication at the single-cell level was then performed based on the UMI sequences and the alignment location, and retained reads were split into SAM files per cell. These single-cell SAM files were converted into alignment tsv files using the sam2tsv function in jvarkit. Only reads with FLAG values of 0 or 16 were kept, and within them only high-quality mismatches with QUAL scores > 45 and a CIGAR operation of M were retained. All mutations were transformed onto the plus strand and were further filtered against background SNPs called by VarScan using our in-house EasySci data on HEK293 cells. Reads in which at least 30% of mutations were T to C mismatches were identified as nascent reads, and the listed reads were extracted from the single-cell whole-transcriptome SAM files by Picard. Finally, the single-cell whole-transcriptome gene x cell count matrix and the nascent-transcriptome gene x cell count matrix were generated by assigning reads to genes whose genomic locations overlapped the aligned coordinates.
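The nascent-read rule described above (at least 30% of a read's retained mutations being T to C) can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function name and the input representation (a list of `(reference_base, read_base)` pairs already moved to the plus strand and filtered against background SNPs) are assumptions, as is the choice to treat reads without any retained mutation as not nascent.

```python
def is_nascent(mismatches, min_tc_fraction=0.30):
    """Classify one read from its list of (reference_base, read_base) mismatches.

    Assumes mismatches are already on the plus strand and SNP-filtered.
    Reads with no retained mismatches are treated as not nascent (assumption).
    """
    if not mismatches:
        return False
    tc_count = sum(1 for ref, alt in mismatches if (ref, alt) == ("T", "C"))
    return tc_count / len(mismatches) >= min_tc_fraction


# One T>C out of three mismatches is about 33%, which passes the 30% threshold
print(is_nascent([("T", "C"), ("A", "G"), ("G", "A")]))
# One T>C out of four mismatches is 25%, which does not pass
print(is_nascent([("T", "C"), ("A", "G"), ("G", "A"), ("C", "T")]))
```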
At the same time, single-cell exonic/intronic read numbers were also counted by checking whether reads mapped to the exonic or intronic regions of genes. To quantify dCas9-KRAB-MeCP2 expression, a customized gtf file consisting of the complete hg38 genomic annotations and additional annotations for dCas9 was used in this step.
Read1 and Read2 of PerturbSci-Kinetics sgRNA libraries were each matched against their respective constant sequences with a maximum of 1 mismatch allowed. For each filtered read pair, the cell barcode, sgRNA sequence, and UMI were extracted from their designed positions. Extracted sgRNA sequences with a maximum of 1 mismatch from the sgRNA library were accepted and corrected, and the corresponding UMI was used for deduplication. Duplicates were removed by collapsing identical UMI sequences of each individual corrected sgRNA under a unique cell barcode. Cells with overall sgRNA UMI counts higher than 10 were retained and the sgRNA x cell count matrix was constructed.
For compatibility with the GEO paired-end sequencing database, Index5 barcodes of each read pair, originally in separate fastq files, have been attached to the beginning of the read headers.
Assembly: human reference genome (hg38) with two additional customized lentiviral sequences
Supplementary files format and content: Processed data files include 3 mtx files consisting of the whole transcriptome gene count sparse matrix of on-target cells, the nascent transcriptome gene count sparse matrix of on-target cells, and the sgRNA count sparse matrix of on-target cells. The meta-data of these on-target cells are stored in a csv file.
Supplementary files format and content: on_target_whole_tx_count_matrix.mtx: the whole transcriptome gene count sparse matrix of on-target cells
Supplementary files format and content: on_target_nascent_tx_count_matrix.mtx: the nascent transcriptome gene count sparse matrix of on-target cells
Supplementary files format and content: on_target_sgRNA_count_matrix.mtx: the sgRNA count sparse matrix of on-target cells
Supplementary files format and content: on_target_cell_metadata.csv: the meta-data of on-target cells
Supplementary files format and content: whole_txome_sgRNA_sample_name_barcode_table.csv: the cross-reference table of whole transcriptome fastq file names and sgRNA readout fastq file names
Supplementary files format and content: on_target_whole_tx.Genes.tsv: the tsv file including the rownames (gene symbols) annotation for on_target_whole_tx_count_matrix.mtx
Supplementary files format and content: on_target_nascent_tx.Genes.tsv: the tsv file including the rownames (gene symbols) annotation for on_target_nascent_tx_count_matrix.mtx
Supplementary files format and content: on_target_sgRNA.Genes.tsv: the tsv file including the rownames (gene symbols) annotation for on_target_sgRNA_count_matrix.mtx
Supplementary files format and content: on_target_whole_tx.Barcodes.tsv: the tsv file including the colnames (cell barcodes) annotation for on_target_whole_tx_count_matrix.mtx
Supplementary files format and content: on_target_nascent_tx.Barcodes.tsv: the tsv file including the colnames (cell barcodes) annotation for on_target_nascent_tx_count_matrix.mtx
Supplementary files format and content: on_target_sgRNA.Barcodes.tsv: the tsv file including the colnames (cell barcodes) annotation for on_target_sgRNA_count_matrix.mtx
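The sgRNA processing described in the data processing text (correction with at most 1 mismatch, UMI collapsing per cell and sgRNA, and a per-cell cutoff of more than 10 UMIs) can be illustrated with a short sketch. This is not the authors' code. The function names, the `(cell_barcode, sgRNA_sequence, UMI)` record layout, and the toy library are hypothetical, and discarding sequences that fall within 1 mismatch of more than one library entry is an assumption the source does not state.

```python
from collections import defaultdict


def correct_sgrna(seq, library, max_mismatch=1):
    """Return the library sgRNA within max_mismatch of seq, or None.

    Ambiguous sequences (close to several library entries) are dropped;
    this tie-handling is an assumption, not taken from the source.
    """
    if seq in library:
        return seq
    hits = [
        ref for ref in library
        if len(ref) == len(seq)
        and sum(a != b for a, b in zip(ref, seq)) <= max_mismatch
    ]
    return hits[0] if len(hits) == 1 else None


def count_sgrna_umis(records, library, min_umis_per_cell=10):
    """Build {cell: {sgRNA: UMI count}} from (cell, sgRNA_seq, UMI) records."""
    umis = defaultdict(set)  # (cell, corrected sgRNA) -> distinct UMIs
    for cell, seq, umi in records:
        corrected = correct_sgrna(seq, library)
        if corrected is not None:
            umis[(cell, corrected)].add(umi)  # identical UMIs collapse here
    counts = defaultdict(dict)
    for (cell, sgrna), umi_set in umis.items():
        counts[cell][sgrna] = len(umi_set)
    # Keep only cells whose total sgRNA UMI count is higher than the cutoff
    return {
        cell: per_sgrna for cell, per_sgrna in counts.items()
        if sum(per_sgrna.values()) > min_umis_per_cell
    }


# Toy example with a hypothetical two-entry library
library = {"ACGT", "TTTT"}
records = [("c1", "ACGT", f"U{i}") for i in range(11)]
records += [("c1", "ACGT", "U0"), ("c1", "ACGA", "U11"), ("c2", "TTTT", "U0")]
print(count_sgrna_umis(records, library))
```

In the toy data, the repeated `U0` collapses, `ACGA` is corrected to `ACGT`, and cell `c2` is dropped because it has only one UMI.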
|
|
|
Submission date |
Nov 22, 2022 |
Last update date |
Aug 10, 2023 |
Contact name |
Zihan Xu |
E-mail(s) |
zxu@rockefeller.edu
|
Organization name |
The Rockefeller University
|
Lab |
Junyue Cao Lab
|
Street address |
1230 York Ave
|
City |
New York |
State/province |
NY |
ZIP/Postal code |
10065 |
Country |
USA |
|
|
Platform ID |
GPL30882 |
Series (1) |
GSE218566 |
PerturbSci-Kinetics: Dissecting key regulators of transcriptome kinetics through scalable single-cell RNA profiling of pooled CRISPR screens |
|
Relations |
BioSample |
SAMN31840338 |
SRA |
SRX18352309 |
Supplementary file | Size | Download | File type/resource |
GSM6752591_on_target_cell_metadata.csv.gz | 5.6 Mb | (ftp)(http) | CSV |
GSM6752591_on_target_nascent_tx.Barcodes.tsv.gz | 475.4 Kb | (ftp)(http) | TSV |
GSM6752591_on_target_nascent_tx.Genes.tsv.gz | 225.9 Kb | (ftp)(http) | TSV |
GSM6752591_on_target_nascent_tx_count_matrix.mtx.gz | 117.4 Mb | (ftp)(http) | MTX |
GSM6752591_on_target_sgRNA.Barcodes.tsv.gz | 475.4 Kb | (ftp)(http) | TSV |
GSM6752591_on_target_sgRNA.Genes.tsv.gz | 1.8 Kb | (ftp)(http) | TSV |
GSM6752591_on_target_sgRNA_count_matrix.mtx.gz | 1.2 Mb | (ftp)(http) | MTX |
GSM6752591_on_target_whole_tx.Barcodes.tsv.gz | 475.4 Kb | (ftp)(http) | TSV |
GSM6752591_on_target_whole_tx.Genes.tsv.gz | 225.9 Kb | (ftp)(http) | TSV |
GSM6752591_on_target_whole_tx_count_matrix.mtx.gz | 389.9 Mb | (ftp)(http) | MTX |
GSM6752591_whole_txome_sgRNA_sample_name_barcode_table.csv.gz | 1.4 Kb | (ftp)(http) | CSV |
SRA Run Selector |
Raw data are available in SRA |
Processed data provided as supplementary file |
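For readers who want to inspect the processed files, the sketch below reads a gzip-compressed Matrix Market file and its label files using only the Python standard library. It is a hedged illustration: the helper names are hypothetical, it assumes the `.mtx.gz` files use the standard coordinate layout (a `%%MatrixMarket` header, a size line, then 1-based `row col value` triplets), and it assumes the Genes and Barcodes tsv files hold one label per line without a header, which the record does not confirm.

```python
import gzip


def read_labels(path):
    """Read one label per non-empty line from a gzip-compressed text file."""
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


def read_mtx_gz(path):
    """Parse a gzip-compressed Matrix Market coordinate file.

    Returns ((n_rows, n_cols), {(row, col): value}) with 0-based indices.
    """
    shape = None
    entries = {}
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("%") or not line.strip():
                continue  # skip header/comment lines and blank lines
            fields = line.split()
            if shape is None:
                # First data line is the size line: rows, columns, non-zeros
                shape = (int(fields[0]), int(fields[1]))
                continue
            row, col = int(fields[0]) - 1, int(fields[1]) - 1
            entries[(row, col)] = float(fields[2])
    return shape, entries


# Example usage, assuming the files were downloaded to the working directory:
# shape, counts = read_mtx_gz("GSM6752591_on_target_sgRNA_count_matrix.mtx.gz")
# genes = read_labels("GSM6752591_on_target_sgRNA.Genes.tsv.gz")        # rows
# cells = read_labels("GSM6752591_on_target_sgRNA.Barcodes.tsv.gz")     # columns
```

A plain dictionary is only practical for the small sgRNA matrix. For the whole-transcriptome and nascent matrices, a sparse-matrix reader such as `scipy.io.mmread` would likely be the better choice.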
|
|
|
|
 |