Sample GSM3302798
Status Public on Jul 01, 2021
Title Nor rep2 (for LncRNA-seq)
Sample type SRA
 
Source name Liver
Organism Mus musculus
Characteristics treatment: PBS
strain: C57BL/6
tissue: liver
age: 10-month-old
genotype: wild type
Treatment protocol C57BL/6 female mice (F0 generation) were fed either normal chow (NC) or a high-fat diet (HFD; 60% kcal fat) from 1 to 3 months of age and mated with NC-fed male mice to produce the F1 generation. The F2 generation was created similarly. Male offspring of the F0, F1 and F2 generations were intraperitoneally injected with DEN at 2 weeks of age and maintained on HFD; these were named the H1D, H2D and H3D groups, respectively. Male offspring of the F0 generation were injected with PBS (Nor group) or DEN (NCD group) at the same age and maintained on NC. All groups were sacrificed at 40 weeks of age.
Growth protocol C57BL/6 breeding pairs were housed at a controlled temperature with a 12 hr/12 hr light/dark cycle and free access to water and food. Animals were handled according to the Guidelines of the China Animal Welfare Legislation, and the procedures were approved by the Committee on Ethics in the Care and Use of Laboratory Animals of the College of Life Sciences, Wuhan University.
Extracted molecule total RNA
Extraction protocol Total RNA from liver samples was extracted individually using RNAiso Plus (Takara Biotechnology, Dalian, China). An equal amount of RNA (n = 4-8) from the same group was combined into 2 samples/group for lncRNA sequencing and 1 sample/group for small RNA sequencing as described previously (Ding et al., 2014). Sequencing was performed by Novogene (Beijing, China).
For lncRNA sequencing, briefly, ribosomal RNA was removed using the Ribo-Zero rRNA Removal Kit (Epicentre, USA), and sequencing libraries were generated. The index-coded samples were clustered using the TruSeq PE Cluster Kit v3-cBot-HS (Illumina). After cluster generation, the libraries were sequenced on an Illumina HiSeq 2500 platform and 125 bp paired-end reads were generated. For small RNA sequencing, briefly, sequencing libraries were generated using the NEBNext Multiplex Small RNA Library Prep Set for Illumina (NEB, USA), and index codes were added to attribute sequences to each sample. The index-coded samples were clustered, after which the library preparations were sequenced on an Illumina HiSeq 2500/2000 platform and 50 bp single-end reads were generated.
lncRNA RNA-Seq and small RNA RNA-seq
 
Library strategy RNA-Seq
Library source transcriptomic
Library selection cDNA
Instrument model Illumina HiSeq 2000
 
Description Annotated_lncRNA_FPKM.xls
Novel_lncRNA_FPKM.xls
mRNA_FPKM.xls
TUCP_FPKM.xls
Data processing Quality control: Raw data (raw reads) in FASTQ format were first processed with in-house Perl scripts. In this step, clean data (clean reads) were obtained by removing reads containing adapter sequence, reads containing poly-N, and low-quality reads from the raw data. At the same time, the Q20, Q30 and GC content of the clean data were calculated. All downstream analyses were based on the high-quality clean data.
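The in-house Perl scripts are not part of this record, so the following is only a minimal Python sketch of the filtering and summary statistics described above. The function names, the Phred+33 quality encoding, and every threshold (`max_n_frac`, `low_q`, `max_low_frac`) are illustrative assumptions, not the values used by the submitters.

```python
def phred_scores(qual, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in qual]


def is_clean(seq, qual, adapter, max_n_frac=0.1, low_q=5, max_low_frac=0.5):
    """Return True if a read passes the three filters named in the record.

    All thresholds are hypothetical placeholders.
    """
    if not seq or len(seq) != len(qual):
        return False
    # 1. Drop reads containing the adapter sequence.
    if adapter and adapter in seq:
        return False
    # 2. Drop reads with too many undetermined bases (poly-N).
    if seq.upper().count("N") / len(seq) > max_n_frac:
        return False
    # 3. Drop reads in which too many bases have low quality.
    scores = phred_scores(qual)
    low = sum(s <= low_q for s in scores)
    return low / len(scores) <= max_low_frac


def qc_summary(reads, offset=33):
    """Compute Q20, Q30 and GC fractions over all bases of (seq, qual) pairs."""
    total = q20 = q30 = gc = 0
    for seq, qual in reads:
        scores = phred_scores(qual, offset)
        total += len(scores)
        q20 += sum(s >= 20 for s in scores)
        q30 += sum(s >= 30 for s in scores)
        gc += sum(base in "GCgc" for base in seq)
    if total == 0:
        return {"Q20": 0.0, "Q30": 0.0, "GC": 0.0}
    return {"Q20": q20 / total, "Q30": q30 / total, "GC": gc / total}
```

In practice `qc_summary` would be applied only to reads for which `is_clean` returns True, matching the statement that Q20, Q30 and GC content were calculated on the clean data.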
Mapping to the reference genome: Reference genome and gene model annotation files were downloaded directly from the genome website. The index of the reference genome was built using Bowtie v2.0.6, and paired-end clean reads were aligned to the reference genome using TopHat v2.0.9.
Transcriptome assembly: The mapped reads of each sample were assembled by both Scripture (beta2) (Guttman et al. 2010) and Cufflinks (v2.1.1) (Trapnell et al. 2010) in a reference-based approach. Both methods use spliced reads to determine exon connectivity, but in two different ways. Scripture uses a statistical segmentation model to distinguish expressed loci from experimental noise and uses spliced reads to assemble expressed segments. It reports all statistically expressed isoforms in a given locus. Cufflinks uses a probabilistic model to simultaneously assemble and quantify the expression level of a minimal set of isoforms that provides a maximum likelihood explanation of the expression data in a given locus (Cabili et al. 2011). Scripture was run with default parameters. Cufflinks was run with ‘min-frags-per-transfrag=0’ and ‘--library-type’; all other parameters were left at their defaults.
Conservation analysis: PHAST (v1.3) is a software package containing many statistical programs, mostly used in phylogenetic analysis (Siepel et al. 2005), and phastCons is a program for conservation scoring and identification of conserved elements. We used phyloFit to compute phylogenetic models for conserved and non-conserved regions among species, and then passed the model and HMM transition parameters to phyloP to compute a set of conservation scores for lncRNAs and coding genes.
Quantification of gene expression level: Cuffdiff (v2.1.1) was used to calculate FPKMs of both lncRNAs and coding genes in each sample (Trapnell et al. 2010). Gene FPKMs were computed by summing the FPKMs of the transcripts in each gene group. FPKM stands for fragments per kilobase of exon per million fragments mapped; it is calculated from the length of a transcript and the number of fragments mapped to it.
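As a rough illustration of the definition above, the sketch below computes FPKM from raw counts and sums transcript FPKMs per gene. It uses the textbook formula only; Cuffdiff estimates transcript abundances with a statistical model, so its reported values will generally differ. The function names and the input layout are assumptions made for this example.

```python
from collections import defaultdict


def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped.

    fragments / (length_bp / 1e3) / (total_mapped_fragments / 1e6)
    """
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def gene_fpkms(transcripts):
    """Sum transcript FPKMs per gene.

    `transcripts` maps a transcript ID to a (gene_id, transcript_fpkm) pair;
    this layout is hypothetical.
    """
    totals = defaultdict(float)
    for gene_id, value in transcripts.values():
        totals[gene_id] += value
    return dict(totals)
```

For example, a 2,000 bp transcript with 500 mapped fragments in a library of 10 million mapped fragments has an FPKM of 25 under this formula.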
Differential expression analysis: Cuffdiff provides statistical routines for determining differential expression in digital transcript or gene expression data using a model based on the negative binomial distribution (Trapnell et al. 2010). Transcripts with an adjusted P-value <0.05 were considered differentially expressed.
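The record does not state which multiple-testing correction produced the adjusted P-values. The sketch below assumes the Benjamini-Hochberg procedure, a common choice for this kind of analysis, and shows how the 0.05 cutoff would then be applied. The function name and the example P-values are illustrative only.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    Assumption: BH is used here for illustration; the record does not
    name the correction method.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values for four transcripts.
raw = [0.01, 0.04, 0.03, 0.2]
adj = benjamini_hochberg(raw)
significant = [i for i, p in enumerate(adj) if p < 0.05]
```

Note that a transcript with a raw P-value below 0.05 can fail the cutoff after adjustment, which is why the threshold is applied to the adjusted values.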
GO and KEGG enrichment analysis: Gene Ontology (GO) enrichment analysis of differentially expressed genes or lncRNA target genes was implemented with the GOseq R package, in which gene length bias was corrected. GO terms with a corrected P-value less than 0.05 were considered significantly enriched among differentially expressed genes. KEGG is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies (http://www.genome.jp/kegg/). We used KOBAS software to test the statistical enrichment of differentially expressed genes or lncRNA target genes in KEGG pathways.
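To make the idea of an enrichment test concrete, the sketch below computes a one-sided hypergeometric P-value for the overlap between a gene list and a term or pathway. This is a generic over-representation test and an assumption on our part: it does not reproduce GOseq's length-bias correction, and the record does not say which test KOBAS was configured to use. All names and numbers are illustrative.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(population, annotated, drawn).

    population: number of background genes
    annotated:  background genes annotated to the term or pathway
    drawn:      genes in the list of interest (e.g. differentially expressed)
    overlap:    genes in the list that are annotated to the term
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    hits = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total
```

With 10 background genes, 3 of them annotated, and a list of 4 genes containing 2 annotated ones, the function returns 1/3. The resulting P-values would still need multiple-testing correction across all terms before applying the 0.05 cutoff.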
PPI analysis: PPI analysis of differentially expressed genes was based on the STRING database, which contains known and predicted protein-protein interactions. For species present in the database, we constructed the networks by extracting the target gene list from the database; otherwise, Blastx (v2.2.28) was used to align the target gene sequences to the selected reference protein sequences, and the networks were then built according to the known interactions of the selected reference species.
Alternative splicing analysis: Alternative splicing (AS) events were classified into 12 basic types using ASprofile v1.0. The number of AS events was estimated separately for each sample.
SNP analysis: Picard-tools v1.96 and samtools v0.1.18 were used to sort, mark duplicate reads in, and reorder the BAM alignment results of each sample. GATK2 software was used to perform SNP calling.
Genome_build: Mus_musculus, GRCm38, ensembl 79
Supplementary_files_format_and_content: Zip files of all processed data of lncRNA and small RNA for samples including Nor, NCD, H1D, H2D, H3D
 
Submission date Jul 23, 2018
Last update date Jul 01, 2021
Contact name Yu Sun
E-mail(s) 1225194625@qq.com
Phone 18707192719
Organization name Wuhan University
Street address Luojiashan Street
City Wuhan
ZIP/Postal code 430072
Country China
 
Platform ID GPL13112
Series (1)
GSE117539 Maternal multi-generational high fat diet (HFD) exposure increases the offspring HCC incidence
Relations
BioSample SAMN09704580
SRA SRX4420719

Supplementary data files not provided
Raw data are available in SRA
Processed data are available on Series record
