Sample GSM3293679
Status Public on May 17, 2019
Title WT_N_3 (5711)
Sample type SRA
 
Source name normal mammary epithelial cells
Organism Mus musculus
Characteristics strain: C3H/HeJ
genotyping: Wild type
age: 270 days
Extracted molecule polyA RNA
Extraction protocol RNA was extracted from tumors using TRI reagent (Biolab), then chloroform was added. The upper layer was taken and washed with isopropanol and then with 75% ethanol. The RNA was eluted using DEPC water.
Libraries were made using KAPA Single-Indexed Adapter Kit (Illumina, Massachusetts, USA)
 
Library strategy RNA-Seq
Library source transcriptomic
Library selection cDNA
Instrument model Illumina NextSeq 500
 
Description DESeq2_final 14-7-2018
Data processing Trimming and filtering of raw reads: The NextSeq base-call files were converted to fastq files using the bcl2fastq (v2.17.1.14) program with default parameters. The provided SampleSheet.csv file contained only sample names and barcodes, so no trimming or filtering was done at this stage, and a separate fastq file was created for each sample. Raw reads (fastq files) were inspected for quality issues with FastQC (v0.11.2, http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Based on the FastQC report, reads were quality-trimmed at both ends, using in-house Perl scripts, with a quality threshold of 32. In short, the scripts use a sliding window of 5 bases from the read's end and trim one base at a time until the average quality of the window passes the given threshold. Following quality-trimming, adapter sequences were removed with cutadapt (version 1.7.1, http://cutadapt.readthedocs.org/en/stable/), using a minimal overlap of 1 (-O parameter), allowing for read wildcards, and filtering out reads that became shorter than 15 nt (-m parameter). The remaining reads were further filtered to remove very low quality reads, using the fastq_quality_filter program of the FASTX package (version 0.0.13, http://hannonlab.cshl.edu/fastx_toolkit/), with a quality threshold of 20 at 90 percent or more of the read's positions.
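The in-house Perl scripts are not part of this record, so the following Python code is only a minimal sketch of the two quality steps as described above: sliding-window trimming at both ends (window of 5, threshold 32) and the 90-percent filter at quality 20. The function names, the use of already-decoded Phred scores as integers, and the choice to stop trimming once fewer bases than one window remain are assumptions, not details taken from the original scripts or from the FASTX source.

```python
from typing import List, Tuple

WINDOW = 5       # window size stated in the record
THRESHOLD = 32   # trimming quality threshold stated in the record


def trim_both_ends(quals: List[int], window: int = WINDOW,
                   threshold: float = THRESHOLD) -> Tuple[int, int]:
    """Return (start, end) so that quals[start:end] is the region kept.

    Assumption: trimming stops once fewer than `window` bases remain.
    """
    start, end = 0, len(quals)
    # Trim the 3' end one base at a time while the end window is below threshold.
    while end - start >= window and sum(quals[end - window:end]) / window < threshold:
        end -= 1
    # Trim the 5' end in the same way.
    while end - start >= window and sum(quals[start:start + window]) / window < threshold:
        start += 1
    return start, end


def passes_quality_filter(quals: List[int], min_qual: int = 20,
                          min_percent: int = 90) -> bool:
    """Keep a read if at least `min_percent` of its bases have quality >= `min_qual`."""
    if not quals:
        return False
    good = sum(1 for q in quals if q >= min_qual)
    return 100 * good >= min_percent * len(quals)


if __name__ == "__main__":
    read_quals = [35] * 10 + [10] * 5   # hypothetical read with a poor 3' tail
    start, end = trim_both_ends(read_quals)
    print(start, end, passes_quality_filter(read_quals[start:end]))
```

In this sketch a read that ends up shorter than one window is left untouched by the trimmer; in the pipeline described above such a read would in any case fall below the 15 nt minimum length applied by cutadapt.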
Mapping and differential expression analysis: The processed fastq files were mapped to the mouse transcriptome and genome using TopHat (v2.0.13). The genome version was GRCm38, with annotations from Ensembl release 84. Mapping allowed up to 5 mismatches per read, a maximum gap of 5 bases, and a total edit distance of 10 (full command: tophat -G genes.gtf -N 5 --read-gap-length 5 --read-edit-dist 10 --segment-length 20 --read-realign-edit-dist 3 --library-type fr-firststrand genome processed.fastq). For the statistics file, quantification was done using htseq-count (version 0.6.0, http://www-huber.embl.de/users/anders/HTSeq/doc/count.html). Strand information was set to 'reverse', and the annotation file used lacked information for genes of type IG, TR, Mt, rRNA, tRNA, miRNA, misc_RNA, scRNA, snRNA, snoRNA, sRNA, scaRNA, piRNA, vaultRNA, ribozyme, artifact and LRG_gene. For further analysis, quantification was done with the Cufflinks package (v2.2.1), using the cuffquant program with genome bias correction (-b parameter), the multi-mapped read assignment algorithm (-u parameter) and masking of genes of the same types as above (IG, TR, etc.) (-M parameter). Strand directionality was specified with the --library-type fr-firststrand parameter. Raw counts were obtained by running cuffnorm on the cuffquant output. Normalization and differential expression were done with the DESeq2 package (version 1.10.1). Genes with a sum of counts less than 2 over all samples were filtered out prior to normalization, then dispersion and size factors were calculated. Differential expression was calculated using a design that included the genotype factor, compensating for any run effects (both P53_WWOX_DKO and WWOX_KO_T genotypes included samples from 2 different runs). Calculations were performed with default parameters. All required comparisons were performed with a significance threshold of padj<0.1, using default parameters.
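The low-count filter applied before normalization (genes with a sum of counts less than 2 over all samples are removed) can be illustrated with a small Python sketch. The actual analysis was run in R with DESeq2; the function name, the dictionary layout and the gene names below are hypothetical and serve only to show the rule.

```python
from typing import Dict, List


def filter_low_count_genes(counts: Dict[str, List[int]],
                           min_total: int = 2) -> Dict[str, List[int]]:
    """Keep genes whose counts summed over all samples are at least `min_total`."""
    return {gene: row for gene, row in counts.items() if sum(row) >= min_total}


if __name__ == "__main__":
    # Hypothetical count table: gene -> raw counts per sample.
    raw_counts = {
        "geneA": [0, 0, 1],   # total 1, removed
        "geneB": [0, 1, 1],   # total 2, kept
        "geneC": [5, 0, 3],   # total 8, kept
    }
    print(sorted(filter_low_count_genes(raw_counts)))
```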
Comparisons between normal and cancer samples were performed with the lfcThreshold parameter (which changes the default test for any change into a test for a change above a given threshold), setting its value to 0.7 (~1.6-fold). Several quality control assays, such as count distributions and principal component analysis, as well as differential expression results, were calculated and visualized in R (version 3.2.1, with packages 'RColorBrewer_1.1-2', 'pheatmap_1.0.8' and 'ggplot2_2.1.0'). The principal component analysis showed that sample P53_WWOX_DKO_4 harbored an additional, unknown source of variation and thus might be an outlier. Hence, an analysis that excluded this sample (using the same parameters otherwise) was also performed (termed All_noDKO4). In addition, samples in this experiment came from two different mouse strains, introducing a source of variation that might not be directly related to the different genotypes. For each strain, an analysis was performed that included only its own samples, calculating the intra-strain comparisons (using the same parameters as above). Results (for all analyses) were then combined with gene details (such as symbol, Entrez accession, etc.) taken from the results of a BioMart query (Ensembl, release 84) to produce the final Excel file.
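The stated equivalence between a log2 fold change threshold of 0.7 and a fold change of about 1.6 follows from fold = 2^lfc. A short Python check is given below; the helper names are hypothetical and are not part of DESeq2.

```python
import math


def lfc_to_fold(lfc: float) -> float:
    """Convert a log2 fold change to a linear fold change."""
    return 2.0 ** lfc


def fold_to_lfc(fold: float) -> float:
    """Convert a linear fold change back to a log2 fold change."""
    return math.log2(fold)


if __name__ == "__main__":
    print(round(lfc_to_fold(0.7), 2))   # 1.62
```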
Genome_build: mm10
Supplementary_files_format_and_content: Normalization and differential expression were done with the DESeq2 package (version 1.10.1).
 
Submission date Jul 19, 2018
Last update date May 17, 2019
Contact name Suhaib Khaled Abdeen
E-mail(s) suhibabdin@gmail.com
Organization name Hebrew University
Department Immunology and cancer research
Lab Rami Aqeilan
Street address Ein kerem
City Jerusalem
ZIP/Postal code 9112101
Country Israel
 
Platform ID GPL19057
Series (1)
GSE117387 Somatic loss of WWOX is associated with TP53 perturbation in basal-like breast cancer
Relations
BioSample SAMN09692894
SRA SRX4409210

Supplementary data files not provided
Raw data are available in SRA
Processed data are available on Series record