Status |
Public on Jun 09, 2015 |
Title |
2i_E6 |
Sample type |
SRA |
|
|
Source name |
Embryonic stem cells
|
Organism |
Mus musculus |
Characteristics |
cell type: Embryonic Stem Cells
growth medium: PD0325901 (1 μM), CHIR99021 (3 μM) and LIF (10 ng/ml)
cell cycle stage: G1
batch barcode: AAGCTA
cell barcode: GGAACT
|
Extracted molecule |
polyA RNA |
Extraction protocol |
Single cell RNA sequencing (no RNA extraction, FACS sorting into 0.6 µL lysis buffer containing 10% NP-40). Polyadenylation-site specific single cell transcriptome sequencing using the BATSeq protocol (Velten et al., in preparation). The protocol is highly multiplexed and uses both batch and cell barcodes, which are given in the 'Characteristics' section. 4 independent sequencing runs were performed.
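The use of batch and cell barcodes can be illustrated with a minimal Python sketch. It is not the BATSeq code. It assumes, hypothetically, that each 6-base barcode sits at the very start of its read, with the batch barcode in the forward read and the cell barcode in the reverse read (as stated under 'Data processing'). The function name `demultiplex` and the exact-match rule are illustrative choices.

```python
from collections import defaultdict

# Length of the barcodes listed under 'Characteristics' (AAGCTA, GGAACT).
BARCODE_LEN = 6


def demultiplex(read_pairs, known_batches, known_cells):
    """Group read pairs by (batch barcode, cell barcode).

    Assumption (not from the record): each barcode occupies the first
    BARCODE_LEN bases of its read and must match a known barcode exactly.
    """
    bins = defaultdict(list)
    for fwd, rev in read_pairs:
        batch = fwd[:BARCODE_LEN]   # batch barcode from the forward read
        cell = rev[:BARCODE_LEN]    # cell barcode from the reverse read
        if batch in known_batches and cell in known_cells:
            # Keep the reads with their barcodes removed.
            bins[(batch, cell)].append((fwd[BARCODE_LEN:], rev[BARCODE_LEN:]))
    return dict(bins)
```

Read pairs whose barcodes are not in the known sets are silently dropped here; a real pipeline might instead tolerate mismatches or log the discarded reads.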
|
|
|
Library strategy |
RNA-Seq |
Library source |
transcriptomic |
Library selection |
cDNA |
Instrument model |
Illumina MiSeq |
|
|
Description |
Sample 30 Single Cell
|
Data processing |
Demultiplexed using batch and cell barcode (in forward and reverse read, respectively). (Python2.7/HTSeq)
Trim off 8 bases of the reverse read (molecular barcode) and store with first read. Discard reverse read. (Perl5)
Trim off polyA from 3' end of forward read. Discard reads not ending in a polyA tail. (Python2.7/HTSeq)
Align reads to Mus musculus genome GRCm38 (Ensembl version 38.73), with the sequences of the ERCC control RNA spike-ins appended to it. (Gsnap)
Filter out reads of non-unique alignment, low alignment quality, alignment to A-rich genomic regions, A-rich reads. (Python2.7/HTSeq)
Count unique molecular barcodes aligning to different genomic features. (Python2.7/HTSeq)
Remove barcode counts stemming from similar molecular barcodes with a Hamming distance of less than 1. (Perl5)
Collapse putative polyadenylation sites stemming from genomic positions within 12 bp distance. (Perl5)
For the summary files, 27 cells were removed because fewer than 1000 molecular barcodes were observed. (R-3.0.2)
Genome_build: GRCm38
Supplementary_files_format_and_content: *isoforms.counts.gz: comma-separated value files containing the Ensembl gene ID of the closest genomic feature, the 3' alignment position (polyadenylation site), the barcode count, the read count, a brief description of the alignment site, and the distance to the closest annotated transcript termination site.
Supplementary_files_format_and_content: TotalGeneCounts.csv contains total counts of molecular barcodes by gene (identified by Ensembl gene ID, column 1) for the 107 cells that pass the quality control criteria (columns 2-108); includes some summarisation of the processed data files (removal of polyadenylation site information).
Supplementary_files_format_and_content: TotalIsoformCounts.csv contains molecular barcode and read counts (columns 3 and 4) by gene (column 1), polyadenylation site in genomic coordinates (column 2), and cell (column 7) for the 107 cells that pass the quality control criteria. Additional information on distance to closest annotated transcript termination site (column 6) and a brief description of the alignment position (column 5) is also given.
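Three of the processing steps above (polyA trimming, counting unique molecular barcodes, and collapsing nearby polyadenylation sites) can be sketched in Python as follows. This is an illustration under stated assumptions, not the submitters' Python2.7/Perl5 code: the minimum tail length `min_tail`, the chaining rule used to merge sites, and the choice of the highest-count position as the cluster representative are all hypothetical.

```python
import re
from collections import defaultdict

POLYA_TAIL = re.compile(r"A+$")


def trim_polya(seq, min_tail=6):
    """Strip the 3' polyA tail; return None if the read has no tail.

    min_tail is an assumed threshold, not a value from the record.
    """
    match = POLYA_TAIL.search(seq)
    if match is None or len(match.group()) < min_tail:
        return None  # read does not end in a polyA tail: discard
    return seq[:match.start()]


def count_unique_barcodes(records):
    """records: iterable of (feature, molecular_barcode) pairs."""
    seen = defaultdict(set)
    for feature, barcode in records:
        seen[feature].add(barcode)
    return {feature: len(barcodes) for feature, barcodes in seen.items()}


def collapse_sites(site_counts, max_gap=12):
    """Merge positions whose sorted neighbours lie within max_gap bp.

    site_counts maps genomic position -> barcode count. Each cluster is
    reported at its highest-count position (an assumed convention).
    """
    clusters = []
    for pos in sorted(site_counts):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    collapsed = {}
    for cluster in clusters:
        best = max(cluster, key=lambda p: site_counts[p])
        collapsed[best] = sum(site_counts[p] for p in cluster)
    return collapsed
```

For example, `collapse_sites({100: 5, 105: 20, 117: 1, 200: 3})` merges the first three positions into one site at 105 with a count of 26 and leaves 200 on its own. Chaining neighbours this way can join long runs of closely spaced sites, so a fixed-window rule may be preferable in some data.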
|
|
|
Submission date |
Aug 26, 2014 |
Last update date |
May 15, 2019 |
Contact name |
Lars Velten |
E-mail(s) |
lars.velten@crg.eu
|
Organization name |
CRG
|
Department |
Bioinformatics and Genomics
|
Lab |
Velten lab
|
Street address |
C. Dr. Aiguader 88, P05
|
City |
Barcelona |
ZIP/Postal code |
08003 |
Country |
Spain |
|
|
Platform ID |
GPL16417 |
Series (1) |
|
Relations |
BioSample |
SAMN03009011 |
SRA |
SRX687137 |
Supplementary file | Size | Download | File type/resource |
GSM1488103_testMPX_plate1_E_6_2i-pA.isoforms.counts.txt.gz | 44.5 Kb | (ftp)(http) | TXT |
Raw data are available in SRA |
Processed data provided as supplementary file |
Processed data are available on Series record |