Series GSE30567
Status Public on Jul 13, 2011
Title ENCODE Cold Spring Harbor Labs Long RNA-seq (hg19)
Project ENCODE
Organism Homo sapiens
Experiment type Expression profiling by high throughput sequencing
Summary This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Carrie Davis (experimental), Alex Dobin (computational), Felix Schlesinger (computational), Tom Gingeras (primary investigator), and Roderic Guigo's group at the CRG). If you have questions about the Genome Browser track associated with this data, contact ENCODE (

These tracks were generated by the ENCODE Consortium. They contain information about human RNAs > 200 nucleotides in length obtained as short reads from the Illumina GAIIx platform. Data are available from biological replicates of several cell lines. In addition to profiling Poly-A+ and Poly-A- RNA from whole cells, we have also gathered data from various subcellular compartments. In many cases, Cap Analysis of Gene Expression (CAGE, RIKEN Institute), Small RNA-Seq (<200 nucleotides, CSHL), and Pair-End di-TAG-RNA (PET-RNA, Genome Institute of Singapore) datasets are available from the same biological replicates.

For data usage terms and conditions, please refer to and
Overall design We are using the published protocol This protocol generates directional libraries and reports the transcript's strand of origin. Exogenous RNA spike-ins (Round 5, pool 14), in development at the National Institute of Standards and Technology, were added to each endogenous RNA isolate and carried through library construction and sequencing. The Illumina PhiX control library was also spiked in at 1% to each completed human library just prior to cluster formation. Accompanying each RNA-Seq dataset is a "Production Document". This document contains details about the RNA isolations and treatments, library construction, and spike-ins, as well as quality control figures for individual libraries. The spike-in sequences and concentrations are available for download in the supplemental directory.
The libraries are sequenced on the Illumina platform to an average depth of ~200 million reads (100 million mate-pairs). The data are mapped against hg19 using Spliced Transcript Alignment and Reconstruction (STAR), written by Alex Dobin (CSHL). More information about STAR, including the parameters used for these data, can be found at:
Additionally, we provide the following processed "element" data files: de novo splice junctions, de novo transcripts, and contigs. These elements are assessed for reproducibility using a nonparametric irreproducible detection rate (IDR) script. The IDR value for each element is included in the files so that end users can apply their own threshold. An IDR value of 0.1 means that the probability of detecting that element in a third experiment, equivalent in depth to the sum of the bioreplicates, is 90%. In addition, we compute expression values for annotated genes, transcripts, and exons.
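The IDR interpretation above can be illustrated with a minimal Python sketch. It assumes the IDR values have already been read from an element file into memory. The `Element` record, the element names, and the helper functions are hypothetical and do not reflect the actual column layout of the supplementary files.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Element:
    """Hypothetical in-memory record for one processed element."""
    name: str
    idr: float  # IDR value as reported alongside the element


def reproducibility(idr: float) -> float:
    """Probability of re-detecting the element, per the description: 1 - IDR."""
    return 1.0 - idr


def filter_by_idr(elements: List[Element], max_idr: float = 0.1) -> List[Element]:
    """Keep elements whose IDR is at or below the chosen threshold."""
    return [e for e in elements if e.idr <= max_idr]


# Made-up example values, for illustration only.
elements = [
    Element("junction_a", 0.01),
    Element("junction_b", 0.10),
    Element("contig_c", 0.35),
]

kept = filter_by_idr(elements, max_idr=0.1)
for e in kept:
    print(f"{e.name}: IDR={e.idr:.2f}, reproducibility={reproducibility(e.idr):.0%}")
```

A stricter threshold (for example `max_idr=0.05`) keeps fewer elements, each with a higher expected reproducibility.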
Web link
Contributor(s) Davis C, Dobin A, Schlesinger F, Gingeras T
Citation(s) 22019781, 22955974, 22955988, 25582907
BioProject PRJNA30709
Submission date Jul 11, 2011
Last update date May 15, 2019
Contact name ENCODE DCC
Organization name ENCODE DCC
Street address 300 Pasteur Dr
City Stanford
State/province CA
ZIP/Postal code 94305-5120
Country USA
Platforms (3)
GPL9115 Illumina Genome Analyzer II (Homo sapiens)
GPL10999 Illumina Genome Analyzer IIx (Homo sapiens)
GPL11154 Illumina HiSeq 2000 (Homo sapiens)
Samples (99)
GSM758559 CSHL_RnaSeq_GM12878_cell_longPolyA (superseded by GSE86658)
GSM758560 CSHL_RnaSeq_GM12878_cytosol_longPolyA (superseded by GSE90222)
GSM758561 CSHL_RnaSeq_AG04450_cell_longPolyA (superseded by GSE78585)
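The sample titles listed above appear to follow an underscore-delimited pattern. The Python sketch below splits a title into its parts, assuming the five fields are lab, assay, cell line, cellular compartment, and RNA fraction. These field names are an interpretation of the three titles shown here, not a documented schema, and `parse_sample_name` is a hypothetical helper.

```python
from typing import NamedTuple


class SampleName(NamedTuple):
    """Assumed fields of a sample title; names are illustrative only."""
    lab: str
    assay: str
    cell_line: str
    compartment: str
    rna_fraction: str


def parse_sample_name(title: str) -> SampleName:
    """Split a title such as 'CSHL_RnaSeq_GM12878_cell_longPolyA (superseded by ...)'."""
    # Drop any trailing parenthetical note, e.g. "(superseded by GSE86658)".
    name = title.split(" (", 1)[0].strip()
    parts = name.split("_")
    if len(parts) != 5:
        raise ValueError(f"unexpected sample title format: {title!r}")
    return SampleName(*parts)


sample = parse_sample_name(
    "CSHL_RnaSeq_GM12878_cytosol_longPolyA (superseded by GSE90222)"
)
print(sample.cell_line, sample.compartment, sample.rna_fraction)
```

Titles that do not split into exactly five parts raise a `ValueError` rather than being guessed at, since the remaining samples in this series are not shown here.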
This SubSeries is part of SuperSeries:
GSE26284 ENCODE Cold Spring Harbor Labs Long RNA-seq
SRA SRP007461

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE30567_RAW.tar 210.4 Gb (http)(custom) TAR (of BAM, BEDRNAELEMENTS, BIGWIG, GFF, GTF, PDF)
GSE30567_run_info.txt.gz 9.5 Kb (ftp)(http) TXT
SRA Run Selector
Raw data are available in SRA
Processed data provided as supplementary file
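For scripted retrieval of the supplementary files, the Python sketch below builds the directory location for a series accession. It assumes the usual GEO layout, in which series are grouped into directories named by replacing the last three digits of the accession with "nnn". The host name and path pattern are assumptions to verify against the GEO download documentation, and `geo_series_suppl_url` is a hypothetical helper.

```python
def geo_series_suppl_url(accession: str) -> str:
    """Return the assumed supplementary-file directory URL for a GSE accession."""
    if not accession.startswith("GSE") or not accession[3:].isdigit():
        raise ValueError(f"not a GEO series accession: {accession!r}")
    digits = accession[3:]
    # Group directory: last three digits replaced by "nnn" (e.g. GSE30567 -> GSE30nnn).
    stub = "GSE" + digits[:-3] + "nnn"
    # Assumed host and path layout; confirm before relying on it.
    return f"https://ftp.ncbi.nlm.nih.gov/geo/series/{stub}/{accession}/suppl/"


url = geo_series_suppl_url("GSE30567")
print(url + "GSE30567_run_info.txt.gz")
```

The small `GSE30567_run_info.txt.gz` file is a sensible first download to test the path, since `GSE30567_RAW.tar` is listed at 210.4 Gb.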
