Status |
Public on May 01, 2017 |
Title |
human invivo dsRNA-seq rep1 |
Sample type |
SRA |
|
|
Source name |
HEK293T cells
|
Organism |
Homo sapiens |
Characteristics |
library strategy: dsRNA-seq
enzyme treatment: proteinase K then ssRNase in vivo
ngs platform: Illumina Hi-Seq2000 50 nt strand-specific
cell line: HEK293T
|
Treatment protocol |
For all in vivo dsRNA-seq and ssRNA-seq libraries, a 37% formaldehyde solution (Sigma, St. Louis, MO) was added drop-wise, with mixing, directly to cell culture dishes containing 90% confluent cells to a final concentration of 1%, and the dishes were incubated at room temperature for 10 minutes to cross-link RNAs to proteins. The cross-linking reaction was quenched with 1 M glycine (Sigma, St. Louis, MO) at a final concentration of 125 mM for 5 minutes with mixing. Cells were then washed twice with ice-cold PBS and collected.
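The stock volumes implied by the concentrations above can be worked out with a simple dilution calculation. The following is a minimal sketch only: the record does not state the volume of medium per dish, so the 20 mL used here is a purely hypothetical value, and the function name is illustrative.

```python
def stock_volume_to_add(current_volume_ml, stock_conc, final_conc):
    """Volume of stock to add to an existing volume to reach final_conc.

    Derived from stock_conc * v = final_conc * (current_volume_ml + v).
    Both concentrations must be in the same unit (e.g. % or mM).
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final_conc must be positive and below stock_conc")
    return current_volume_ml * final_conc / (stock_conc - final_conc)


# Hypothetical dish volume; NOT stated in the record.
medium_ml = 20.0

# 37% formaldehyde stock diluted to a final 1%.
formaldehyde_ml = stock_volume_to_add(medium_ml, 37.0, 1.0)

# 1 M (1000 mM) glycine stock diluted to a final 125 mM,
# added to the medium that already contains the formaldehyde.
glycine_ml = stock_volume_to_add(medium_ml + formaldehyde_ml, 1000.0, 125.0)

print(f"formaldehyde stock: {formaldehyde_ml:.2f} mL")
print(f"glycine stock: {glycine_ml:.2f} mL")
```

The calculation accounts for the added stock increasing the total volume, which matters more for the glycine step than for the formaldehyde step.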
|
Growth protocol |
For in vivo HEK293T dsRNA-seq and ssRNA-seq libraries, HEK293T cells were seeded in 15-cm standard Corning tissue-culture-treated dishes (Sigma, St. Louis, MO) and grown to 90% confluence (approximately 18 million cells) in DMEM medium (Life Technologies, San Diego, CA) supplemented with L-glutamine, 4.5 g/L D-glucose, 10% fetal bovine serum (FBS; Atlanta Biologicals, Atlanta, GA) and Pen/Strep (Fisher Scientific, Waltham, MA).
|
Extracted molecule |
total RNA |
Extraction protocol |
For all in vitro dsRNA-seq and ssRNA-seq libraries, RNAs were treated with ssRNase (RNaseOne) and dsRNase (RNase V1), respectively. For mRNA-seq libraries, poly A+ RNAs were isolated from total RNA using the Dynabeads mRNA Direct kit (Ambion, Austin, TX). For smRNA-seq libraries, RNAs were prepared as previously described in F. Li et al., The Plant Cell 24, 4346. Briefly, total RNA was run on a 15% TBE-urea gel, and gel slices containing 15-35 nt RNAs were isolated and eluted from the acrylamide by mixing with 0.3 M NaCl for 4 hours at RT. The dsRNA-seq, ssRNA-seq, mRNA-seq and smRNA-seq libraries were constructed as previously described in F. Li et al., The Plant Cell 24, 4346. The 3’ and 5’ adapters, reverse transcriptase primer, 5’ PCR primer, and 3’ PCR primers matching the TruSeq small RNA sequencing kit (Illumina, San Diego, CA) were synthesized and HPLC-purified by IDT (Coralville, IA).
|
|
|
Library strategy |
OTHER |
Library source |
transcriptomic |
Library selection |
other |
Instrument model |
Illumina HiSeq 2000 |
|
|
Description |
dsRNA (double-stranded RNA)
RNA is treated with ssRNase (RNaseOne) in vivo
human_refGene_mRNA_invivo_HEK293T_structScore.txt
human_refGene_mRNA_invivo_HEK293T_struct_hotspot_anno.txt
|
Data processing |
Quality control of the reads. The quality scores of the raw reads were checked for their mean, minimum, maximum and standard deviation to make sure that all libraries were of good and comparable quality. The labeling of 3'-adapters and multiplex indices was also checked to confirm the identity of the libraries.
Trimming of 3'-adapters. Reads were trimmed to remove 3'-adapters using the Cutadapt (v1.0) program with ≤10% mismatches and ≥10 nt aligned bases. Reads ≤15 nt after trimming were discarded before all subsequent analysis. Untrimmed reads were kept separately from trimmed reads and processed independently in the following steps.
Reducing to NR-tags. Both trimmed and untrimmed reads were reduced to NR-tags to save processing time and space. The clone abundance of each NR-tag was retained and used in all subsequent abundance calculations in this study, so the terms "reads" and "NR-tags" are used interchangeably hereafter.
Mapping to primate genomes. NR-tags (reads) were mapped to the UCSC hg19, rheMac2 and rheMac2 genomes using the Bowtie (v.0.12.9) program for human, rhesus and cynomolgus, respectively. Note that cynomolgus has only a draft genome release and no corresponding gene model annotations, so the rhesus reference genome and genome annotation were used for all cynomolgus analysis in this study. The Bowtie mapping parameters were tuned to ensure that all alignments with ≤6% mismatches in the first 34 nt (the seed) and ≤8% mismatches in the whole read were reported, and up to 100 random mapping hits were allowed. Note that for GMUCT reads, up to 10 random hits were allowed. Also note that for all cynomolgus libraries, the mismatch criteria were 2% looser (8% for the seed and 10% for the entire read), because these reads were mapped to a slightly divergent genome instead of their own. All mapping results were post-filtered with in-house programs to guarantee the mapping criteria listed above.
For reads with multiple hits on the genome (multiply mapped reads), a max-diverge filter was also applied to select only those hits with a mismatch percentage no more than 4% from that of the best hit of each read. This filter is similar to UCSC BLAT and the Bowtie “--best” mode. All mapping information, such as insert length, clone abundance, and number of locations on the genome, was summarized and loaded into local MySQL databases for further queries.
Mapping across transcript splicing boundaries. To map across splicing boundaries, we first compiled our own GFF annotation files of transcript exons, including coding mRNAs (RefSeq-mRNA for human, Ensembl-mRNA for rhesus and cynomolgus), all non-coding RNAs (Rfam-annotated structural RNAs for human and Ensembl-ncRNAs for rhesus and cynomolgus), and lincRNAs for human. The unmapped reads from the previous steps were collected and remapped to the genomes, with these GFF annotations supplied, using the TopHat (v2.0.8) program. The parameters were tuned to meet exactly the same criteria as for the Bowtie-mapped reads. Finally, the TopHat hits (spliced reads) were also passed through the max-diverge filter, summarized, and stored in the local MySQL database.
Genome_build: UCSC hg19 for human, UCSC rheMac2 for rhesus and cynomolgus libraries
Supplementary_files_format_and_content: The "*_expr.txt" files are tab-delimited text files containing the expression values of all primate mRNA transcripts as raw clone number (weighted), RPM and RPKM values.
Supplementary_files_format_and_content: The "*_structScore.txt" files are tab-delimited text files containing the calculated "structure score" of every nucleotide of all primate mRNA transcripts with sufficient dsRNA-seq and ssRNA-seq coverage, as well as other pertinent information such as gene length, mRNA length, and (weighted) clone number of the dsRNA-seq and ssRNA-seq reads.
Supplementary_files_format_and_content: The "*_struct_hotspot_anno.txt" files are tab-delimited text files containing information on all predicted "structure hotspots", defined as long regions with significantly high structure scores in the three primates. They also contain other pertinent information, such as the class, summit, average and maximum structure score of the hotspots, and RPKM values from mRNA-seq, smRNA-seq and public GMUCT libraries for the entire mRNAs or for the structure hotspot regions alone. Information on whether the hotspots are bound by the DGCR8/Drosha microprocessor and whether they are conserved between human and rhesus is also included.
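Two of the data-processing steps described above, the reduction of reads to NR-tags and the max-diverge filter, can be illustrated with a short Python sketch. This is not the submitters' in-house code. The function names and the representation of hits as (location, mismatch count) pairs are assumptions made for illustration, and reading "no more than 4% to the best hit" as a difference in mismatch percentage relative to the best hit is one interpretation of the description.

```python
from collections import Counter

# Reads of 15 nt or shorter after trimming are discarded.
MIN_READ_LENGTH = 16


def collapse_to_nr_tags(reads, min_length=MIN_READ_LENGTH):
    """Collapse identical read sequences into NR-tags.

    Returns a Counter mapping each distinct sequence to its clone
    abundance (the number of reads with that sequence).
    """
    return Counter(r for r in reads if len(r) >= min_length)


def max_diverge_filter(hits, read_length, max_diverge_pct=4):
    """Keep hits close enough to the best hit of a multiply mapped read.

    hits: list of (location, mismatch_count) pairs (assumed layout).
    A hit is kept if its mismatch percentage exceeds that of the best
    hit by no more than max_diverge_pct percentage points. Integer
    arithmetic avoids floating-point edge cases at the threshold.
    """
    if not hits:
        return []
    best = min(mismatches for _, mismatches in hits)
    return [
        (loc, mismatches)
        for loc, mismatches in hits
        if (mismatches - best) * 100 <= max_diverge_pct * read_length
    ]


# Example with made-up reads and hit locations.
reads = ["ACGT" * 5, "ACGT" * 5, "ACGTACGTACGTACG", "T" * 20]
tags = collapse_to_nr_tags(reads)  # the 15-nt read is dropped

hits = [("chr1:100", 0), ("chr2:200", 2), ("chr3:300", 3)]
kept = max_diverge_filter(hits, read_length=50)  # 3 mismatches = 6%, dropped
```

Keeping the clone abundance alongside each NR-tag is what allows later abundance calculations to be done on the reduced set without losing read counts.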
|
|
|
Submission date |
May 05, 2014 |
Last update date |
May 15, 2019 |
Contact name |
Qi Zheng |
E-mail(s) |
qi.zheng@nemours.org
|
Phone |
3026515778
|
Organization name |
Nemours Children's Health
|
Street address |
1701 Rockland Road
|
City |
Wilmington |
State/province |
DE |
ZIP/Postal code |
19803 |
Country |
USA |
|
|
Platform ID |
GPL11154 |
Series (1) |
GSE57295 |
Regulatory and evolutionary footprints of messenger RNA secondary structure in primate transcriptomes |
|
Relations |
BioSample |
SAMN02743929 |
SRA |
SRX532706 |
Supplementary data files not provided |
Raw data are available in SRA |
Processed data are available on Series record |