Series GSE132044
Status Public on Jun 01, 2019
Title Systematic comparative analysis of single cell RNA-sequencing methods
Organisms Homo sapiens; Mus musculus
Experiment type Expression profiling by high throughput sequencing
Summary A multitude of single-cell RNA sequencing methods have been developed in recent years, bringing dramatic advances in scale and power and enabling major discoveries and large-scale cell mapping efforts. However, these methods have not been systematically and comprehensively benchmarked. Here, we directly compare seven methods for single-cell and/or single-nucleus profiling from three types of samples (cell lines, peripheral blood mononuclear cells, and brain tissue), generating 36 libraries in six separate experiments in a single center. To analyze these datasets, we developed and applied scumi, a flexible computational pipeline that can be used for any scRNA-seq method. We evaluated the methods both for basic performance and for their ability to recover known biological information in the samples. Our study will help guide experiments with the methods in this study and serve as a benchmark for future studies and for computational algorithm development.
Overall design We systematically and directly compared seven single-cell RNA-sequencing methods: two low-throughput plate-based methods (Smart-seq2 and CEL-Seq2) and five high-throughput methods (10x Chromium (v2, v3), Drop-seq, Seq-Well, inDrops, and sci-RNA-seq), producing expression profiles from ~92,000 cells (nuclei) overall. We tested three sample types (a mixture of human and mouse cell lines, human peripheral blood mononuclear cells (PBMCs), and mouse cortex), each with two replicates, to generate a total of 36 different single-cell RNA-sequencing libraries. For mouse cortex, we tested four single-nucleus RNA-sequencing methods (Smart-seq2, 10x Chromium (v2), DroNc-seq, and sci-RNA-seq).

We tested each sample type in two experiments (Mixture1 and Mixture2, PBMC1 and PBMC2, Cortex1 and Cortex2) run on different days to assess reproducibility. In each comparison experiment, we started with one sample, and processing of the aliquots began at the same time for each method. The only exceptions were in PBMC1: for Seq-Well, we thawed an identical PBMC aliquot a second time to obtain a dataset with sufficient cells profiled, and for 10x Chromium, we thawed an identical aliquot to directly compare version 2 (v2) with version 3 (v3). In each experiment, we aimed to collect data from ~350 cells for the low-throughput methods and ~3,000 cells for the high-throughput methods. In each experiment, we also used an aliquot of cells to generate a bulk RNA-sequencing library as a control.

We sequenced all libraries together in an attempt to avoid batch effects due to varying sequence quality among Illumina flowcell lanes, with the following exceptions. We sequenced the inDrops libraries separately because their read structure is the opposite of that generated by the other methods. We performed additional sequencing for some libraries in an attempt to obtain similar numbers of reads per cell within the low-throughput and within the high-throughput methods. We aimed for 50,000 to 100,000 reads per cell for high-throughput methods and 750,000 to 1,000,000 reads per cell for low-throughput methods.

The scRNA-seq FASTQ files are named by sample name, Illumina flowcell lane, and library preparation method, with fields separated by dots (.). For example, in PBMC2.CC86JANXX.011818-DropSeq.unmapped.1.fastq.gz, PBMC2 is the sample name (one of Mixture1, Mixture2, PBMC1, PBMC2, Cortex1, and Cortex2), CC86JANXX is the flowcell lane identifier, 011818 is the library preparation date, and DropSeq is the RNA-seq method (Drop-seq in this case; one of SM2, CELseq, 10X, DropSeq, DroNcSeq, SeqWell, inDrops, and SciSeq). This FASTQ file (read 1) contains the cell barcode and UMI information. The corresponding cDNA reads are in PBMC2.CC86JANXX.011818-DropSeq.unmapped.2.fastq.gz (read 2). For more information about the read structures of the different protocols, please see Supplementary Table 11 of the manuscript. FASTQ files with the same sample name and library preparation method but different flowcell lanes come from the same library sequenced at different times, e.g., PBMC2.CCLBDANXX.68.011818-DropSeq.unmapped.1.fastq.gz; these files can therefore be merged for analysis. For 10x Chromium data, the reads from each library are split into four files (10X_A, 10X_B, 10X_C, and 10X_D), and the reads from these files can also be merged.
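The naming convention described above can be sketched in code. The following is a minimal, illustrative parser, not part of the scumi pipeline: the names `FastqName`, `parse_fastq_name`, and `group_for_merge` are hypothetical, and the field layout is inferred only from the two example file names in this record. In particular, the extra field in `CCLBDANXX.68` is not explained in the record, so the sketch simply keeps it as part of the flowcell field. The layout of the 10x Chromium FASTQ names (10X_A to 10X_D) is not shown here and is not handled specially.

```python
from collections import defaultdict
from typing import NamedTuple


class FastqName(NamedTuple):
    sample: str      # e.g. "PBMC2"
    flowcell: str    # flowcell field(s), e.g. "CC86JANXX" or "CCLBDANXX.68"
    prep_date: str   # library preparation date, e.g. "011818"
    method: str      # e.g. "DropSeq"
    read: int        # 1 = cell barcode + UMI, 2 = cDNA


def parse_fastq_name(name: str) -> FastqName:
    """Split a dot-separated FASTQ file name into its fields.

    Assumed layout (inferred from the examples in the record):
    sample . flowcell[. extra] . date-method . unmapped . read . fastq . gz
    """
    fields = name.split(".")
    if len(fields) < 7 or fields[-2:] != ["fastq", "gz"] or fields[-4] != "unmapped":
        raise ValueError(f"unexpected FASTQ file name: {name!r}")
    prep_date, sep, method = fields[-5].partition("-")
    if not sep:
        raise ValueError(f"missing date-method field in: {name!r}")
    return FastqName(
        sample=fields[0],
        flowcell=".".join(fields[1:-5]),
        prep_date=prep_date,
        method=method,
        read=int(fields[-3]),
    )


def group_for_merge(names):
    """Group files that share sample, method, and read number.

    Per the record, such files come from the same library and can be merged.
    """
    groups = defaultdict(list)
    for name in names:
        parsed = parse_fastq_name(name)
        groups[(parsed.sample, parsed.method, parsed.read)].append(name)
    return {key: sorted(files) for key, files in groups.items()}


if __name__ == "__main__":
    files = [
        "PBMC2.CC86JANXX.011818-DropSeq.unmapped.1.fastq.gz",
        "PBMC2.CC86JANXX.011818-DropSeq.unmapped.2.fastq.gz",
        "PBMC2.CCLBDANXX.68.011818-DropSeq.unmapped.1.fastq.gz",
    ]
    for key, members in group_for_merge(files).items():
        print(key, members)
```

Grouping by read number keeps barcode/UMI reads and cDNA reads in separate merge sets. When merging, the files in each read-1 group and its matching read-2 group would need to be concatenated in the same order so that read pairs stay aligned.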
Contributor(s) Ding J, Adiconis X, Simmons SK, Kowalczyk MS, Hession CC, Marjanovic ND, Hughes TK, Wadsworth MH, Burks T, Nguyen LT, Kwon JY, Barak B, Ge W, Kedaigle AJ, Carroll S, Li S, Hacohen N, Rozenblatt-Rosen O, Shalek AK, Villani A, Regev A, Levin JZ
Citation(s) PMID: 32341560
Submission date May 31, 2019
Last update date Aug 29, 2020
Contact name Aviv Regev
Organization name Broad Institute
Department Klarman Cell Observatory
Lab Aviv Regev
Street address 415 Main Street
City Cambridge
State/province MA
ZIP/Postal code 02142
Country USA
Platforms (5)
GPL16791 Illumina HiSeq 2500 (Homo sapiens)
GPL17021 Illumina HiSeq 2500 (Mus musculus)
GPL18573 Illumina NextSeq 500 (Homo sapiens)
Samples (3986)
GSM3835722 Mixture1.CAPTEANXX.10X_A
GSM3835723 Mixture1.CAPTEANXX.10X_B
GSM3835724 Mixture1.CAPTEANXX.10X_C
BioProject PRJNA545730
SRA SRP200058


Supplementary file                                 Size      File type
GSE132044_HEK293_PBMC_TPM_bulk.tsv.gz              455.9 Kb  TSV
GSE132044_NIH3T3_cortex_TPM_bulk.tsv.gz            399.2 Kb  TSV
GSE132044_cortex_mm10_cell.tsv.gz                  69.8 Kb   TSV
GSE132044_cortex_mm10_count_matrix.mtx.gz          89.7 Mb   MTX
GSE132044_cortex_mm10_gene.tsv.gz                  198.4 Kb  TSV
GSE132044_mixture_hg19_mm10_cell.tsv.gz            129.9 Kb  TSV
GSE132044_mixture_hg19_mm10_count_matrix.mtx.gz    295.6 Mb  MTX
GSE132044_mixture_hg19_mm10_gene.tsv.gz            467.8 Kb  TSV
GSE132044_pbmc_hg38_cell.tsv.gz                    207.2 Kb  TSV
GSE132044_pbmc_hg38_count_matrix.mtx.gz            121.6 Mb  MTX
GSE132044_pbmc_hg38_gene.tsv.gz                    233.1 Kb  TSV
Processed data are available on Series record
Raw data are available in SRA
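
The processed single-cell files listed above appear to come in sets of three per dataset: a cell table, a gene table, and a count matrix in MTX format. As a rough sketch of how one might pair them up before loading, the snippet below groups the file names by dataset prefix. The grouping rule and the names `SUPPLEMENTARY_FILES` and `bundle_by_dataset` are assumptions made for illustration; the record does not state how the files relate to one another or how the matrix rows and columns are oriented, so that should be checked against the files themselves.

```python
import re
from collections import defaultdict

# File names copied from the supplementary file list of this record.
SUPPLEMENTARY_FILES = [
    "GSE132044_HEK293_PBMC_TPM_bulk.tsv.gz",
    "GSE132044_NIH3T3_cortex_TPM_bulk.tsv.gz",
    "GSE132044_cortex_mm10_cell.tsv.gz",
    "GSE132044_cortex_mm10_count_matrix.mtx.gz",
    "GSE132044_cortex_mm10_gene.tsv.gz",
    "GSE132044_mixture_hg19_mm10_cell.tsv.gz",
    "GSE132044_mixture_hg19_mm10_count_matrix.mtx.gz",
    "GSE132044_mixture_hg19_mm10_gene.tsv.gz",
    "GSE132044_pbmc_hg38_cell.tsv.gz",
    "GSE132044_pbmc_hg38_count_matrix.mtx.gz",
    "GSE132044_pbmc_hg38_gene.tsv.gz",
]

# Assumed naming pattern: GSE132044_<dataset>_<part>.<tsv|mtx>.gz
_PATTERN = re.compile(
    r"^GSE132044_(?P<dataset>.+)_(?P<part>cell|count_matrix|gene)\.(?:tsv|mtx)\.gz$"
)


def bundle_by_dataset(filenames):
    """Map each dataset prefix to its cell, gene, and count_matrix files.

    Files that do not match the assumed pattern (the bulk TPM tables)
    are skipped.
    """
    bundles = defaultdict(dict)
    for name in filenames:
        match = _PATTERN.match(name)
        if match is None:
            continue
        bundles[match["dataset"]][match["part"]] = name
    return dict(bundles)


if __name__ == "__main__":
    for dataset, parts in bundle_by_dataset(SUPPLEMENTARY_FILES).items():
        print(dataset, sorted(parts))
```

Each resulting bundle names the three files that a sparse-matrix reader would need together; the two bulk TPM tables are standalone and fall outside this grouping.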
