|
|
|
Status |
Public on Jul 19, 2016 |
Title |
CA1032 |
Sample type |
SRA |
|
|
Source name |
Whole blood
|
Organism |
Homo sapiens |
Characteristics |
group: Diarrhea
age: <2 year-old
organisms: Norovirus
flowcell: C
lane: 2
|
Extracted molecule |
polyA RNA |
Extraction protocol |
Total RNA was isolated using the MagMAX for Stabilized Blood Tubes RNA Isolation Kit (Ambion, TX), and each RNA sample was globin-reduced with GLOBINclear (Ambion, TX) according to the manufacturer's instructions. Libraries were prepared from the globin-reduced RNA samples using the Illumina TruSeq RNA Sample Preparation kit according to the manufacturer's instructions.
|
|
|
Library strategy |
RNA-Seq |
Library source |
transcriptomic |
Library selection |
cDNA |
Instrument model |
Illumina HiSeq 2500 |
|
|
Data processing |
Base-calling was performed automatically in Illumina BaseSpace after sequencing.
FASTQ reads were trimmed in Galaxy in two steps: 1) hard-trimming to remove one base from the 3' end (FASTQ Trimmer tool, v.1.0.0); 2) quality trimming from both ends until the minimum base quality for each read was >= 30 (FASTQ Quality Trimmer tool, v.1.0.0).
Reads were aligned in Galaxy using bowtie and TopHat (Tophat for Illumina tool, v.1.5.0).
Read counts per Ensembl gene ID were estimated in Galaxy using htseq-count (htseq-count tool, v.0.4.1).
Sequencing, alignment, and quantitation metrics were obtained for FASTQ, BAM/SAM, and count files in Galaxy using FastQC, Picard, TopHat, Samtools, and htseq-count.
Non-protein-coding and mitochondrial genes were filtered out.
Samples were selected for further analysis using these criteria: ratio of unpaired reads examined to FASTQ total reads > 0.75, and median CV coverage < 1.
Samples with total counts of less than one million were excluded.
Genes expressed (counts per million > 1) in fewer than 3 samples were removed.
Data normalization (TMM) and differential expression analysis were performed in R using the "edgeR" package.
Genome_build: GRCh38
Supplementary_files_format_and_content: mexico2_data.csv is a comma-separated matrix. The first column contains the Ensembl gene ID and the second contains the HGNC symbol. The remaining columns contain the read counts assigned to each library. The data represent all processing steps up to read counting, not including gene filtering, normalization, and analysis with edgeR.
Supplementary_files_format_and_content: mexico2_combined_metrics.csv is a comma-separated matrix. The first column contains the sample ID; the remaining columns contain RNA sequencing and alignment metrics.
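The two count-based filters in the data processing description (samples with fewer than one million total counts excluded; genes with counts per million > 1 in fewer than 3 samples removed) could be sketched roughly as follows. This is an illustrative Python sketch, not the submitters' code: the record states that the analysis was done in R with edgeR, and the function name `filter_counts`, the nested-dictionary input layout, and the choice to compute library sizes from the counts passed in are all assumptions made for the example. TMM normalization is not reproduced here.

```python
from typing import Dict

MIN_LIBRARY_SIZE = 1_000_000  # samples with fewer total counts are excluded
MIN_CPM = 1.0                 # a gene is "expressed" in a sample when CPM > 1
MIN_SAMPLES = 3               # genes expressed in fewer than 3 samples are removed


def filter_counts(counts: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, int]]:
    """Apply the sample and gene filters to raw counts.

    `counts` maps gene ID -> {sample ID -> raw read count} (hypothetical layout).
    Returns the same structure restricted to kept genes and kept samples.
    """
    # Collect every sample ID seen in the input.
    samples = sorted({s for per_gene in counts.values() for s in per_gene})

    # Library size = total counts per sample (assumption: computed on the
    # counts passed in, without any normalization factor).
    lib_size = {
        s: sum(per_gene.get(s, 0) for per_gene in counts.values())
        for s in samples
    }

    # Sample filter: drop samples with fewer than one million total counts.
    kept_samples = [s for s in samples if lib_size[s] >= MIN_LIBRARY_SIZE]

    # Gene filter: keep genes with CPM > 1 in at least 3 of the kept samples.
    filtered: Dict[str, Dict[str, int]] = {}
    for gene, per_gene in counts.items():
        n_expressed = sum(
            1
            for s in kept_samples
            if per_gene.get(s, 0) / lib_size[s] * 1_000_000 > MIN_CPM
        )
        if n_expressed >= MIN_SAMPLES:
            filtered[gene] = {s: per_gene.get(s, 0) for s in kept_samples}
    return filtered
```

In this sketch the sample filter runs before the gene filter, so a low-depth sample cannot make a gene pass the "expressed in at least 3 samples" test; the record does not state the order explicitly, so that ordering is also an assumption.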
|
|
|
Submission date |
Jul 19, 2016 |
Last update date |
May 15, 2019 |
Contact name |
Scott Presnell |
E-mail(s) |
SPresnell@benaroyaresearch.org
|
Organization name |
Benaroya Research Institute
|
Street address |
1201 Ninth Avenue
|
City |
Seattle |
State/province |
WA |
ZIP/Postal code |
98101 |
Country |
USA |
|
|
Platform ID |
GPL16791 |
Series (1) |
GSE69529 |
Elucidating the etiology and molecular pathogenicity of infectious diarrhea by high throughput RNA sequencing |
|
Relations |
BioSample |
SAMN05416245 |
SRA |
SRX1959892 |
Supplementary data files not provided |
Processed data are available on Series record |
Raw data are available in SRA |
|
|
|
|
|