Status |
Public on Jul 01, 2021 |
Title |
NCD rep2 (for LncRNA-seq) |
Sample type |
SRA |
|
|
Source name |
Liver
|
Organism |
Mus musculus |
Characteristics |
treatment: normal chow; strain: C57BL/6; tissue: liver; age: 10-month-old; genotype: wild type
|
Treatment protocol |
C57BL/6 female mice (F0 generation) were fed either normal chow (NC) or HFD (60% kcal fat) from 1 to 3 months of age and mated with NC-fed male mice to produce the F1 generation. The F2 generation was created in the same way. Male offspring of the F0, F1 and F2 generations were intraperitoneally injected with DEN at 2 weeks of age and maintained on HFD; these were named the H1D, H2D and H3D groups, respectively. Male offspring of the F0 generation were injected with PBS (Nor group) or DEN (NCD group) at the same age and maintained on NC. All groups were sacrificed at 40 weeks of age.
|
Growth protocol |
C57BL/6 breeding pairs were housed at a controlled temperature with a 12 hr/12 hr light/dark cycle and free access to water and food. Animals were handled according to the Guidelines of the China Animal Welfare Legislation, and the procedures were approved by the Committee on Ethics in the Care and Use of Laboratory Animals of College of Life Sciences, Wuhan University.
|
Extracted molecule |
total RNA |
Extraction protocol |
Total RNA from liver samples was extracted individually using RNAiso Plus (Takara Biotechnology, Dalian, China). Equal amounts of RNA (n = 4-8) from the same group were combined into 2 samples/group for lncRNA sequencing and 1 sample/group for small RNA sequencing, as described previously (Ding et al., 2014). Sequencing was performed by Novogene (Beijing, China). For lncRNA sequencing, briefly, ribosomal RNA was removed with the Ribo-zero rRNA Removal Kit (Epicentre, USA), and sequencing libraries were generated. The index-coded samples were clustered using the TruSeq PE Cluster Kit v3-cBot-HS (Illumina). After cluster generation, the libraries were sequenced on an Illumina HiSeq 2500 platform and 125 bp paired-end reads were generated. For small RNA sequencing, briefly, sequencing libraries were generated using the NEBNext Multiplex Small RNA Library Prep Set for Illumina (NEB, USA), and index codes were added to attribute sequences to each sample. The index-coded samples were clustered, after which the library preparations were sequenced on an Illumina HiSeq 2500/2000 platform and 50 bp single-end reads were generated.
lncRNA RNA-Seq and small RNA RNA-Seq
|
|
|
Library strategy |
RNA-Seq |
Library source |
transcriptomic |
Library selection |
cDNA |
Instrument model |
Illumina HiSeq 2000 |
|
|
Description |
Annotated_lncRNA_FPKM.xls Novel_lncRNA_FPKM.xls mRNA_FPKM.xls TUCP_FPKM.xls
|
Data processing |
Quality control: Raw data (raw reads) in FASTQ format were first processed with in-house Perl scripts. In this step, clean data (clean reads) were obtained by removing reads containing adapters, reads containing poly-N and low-quality reads from the raw data. At the same time, the Q20, Q30 and GC content of the clean data were calculated. All downstream analyses were based on the high-quality clean data.
Mapping to the reference genome: Reference genome and gene model annotation files were downloaded directly from the genome website. An index of the reference genome was built using Bowtie v2.0.6, and paired-end clean reads were aligned to the reference genome using TopHat v2.0.9.
Transcriptome assembly: The mapped reads of each sample were assembled by both Scripture (beta2) (Guttman et al. 2010) and Cufflinks (v2.1.1) (Trapnell et al. 2010) in a reference-based approach. Both methods use spliced reads to determine exon connectivity, but with two different approaches. Scripture uses a statistical segmentation model to distinguish expressed loci from experimental noise and uses spliced reads to assemble expressed segments. It reports all statistically expressed isoforms in a given locus. Cufflinks uses a probabilistic model to simultaneously assemble and quantify the expression level of a minimal set of isoforms that provides a maximum likelihood explanation of the expression data in a given locus (Cabili et al. 2011). Scripture was run with default parameters; Cufflinks was run with ‘min-frags-per-transfrag=0’ and ‘--library-type’, with all other parameters set to default.
Conservation analysis: Phast (v1.3) is a software package containing many statistical programs, mostly used in phylogenetic analysis (Siepel et al. 2005), and phastCons is a program for conservation scoring and identification of conserved elements. We used phyloFit to compute phylogenetic models for conserved and non-conserved regions among species and then passed the model and HMM transition parameters to phyloP to compute a set of conservation scores for lncRNAs and coding genes.
Quantification of gene expression level: Cuffdiff (v2.1.1) was used to calculate FPKMs of both lncRNAs and coding genes in each sample (Trapnell et al. 2010). Gene FPKMs were computed by summing the FPKMs of the transcripts in each gene group. FPKM means fragments per kilobase of exon per million fragments mapped, calculated from the length of the transcript and the number of fragments mapped to it.
Differential expression analysis: Cuffdiff provides statistical routines for determining differential expression in digital transcript or gene expression data using a model based on the negative binomial distribution (Trapnell et al. 2010). Transcripts with a P-adjust < 0.05 were assigned as differentially expressed.
GO and KEGG enrichment analysis: Gene Ontology (GO) enrichment analysis of differentially expressed genes or lncRNA target genes was implemented with the GOseq R package, in which gene length bias was corrected. GO terms with a corrected P value less than 0.05 were considered significantly enriched in differentially expressed genes. KEGG is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies (http://www.genome.jp/kegg/). We used KOBAS software to test the statistical enrichment of differentially expressed genes or lncRNA target genes in KEGG pathways.
PPI analysis: PPI analysis of differentially expressed genes was based on the STRING database, which contains known and predicted protein-protein interactions. For species present in the database, we constructed the networks by extracting the target gene list from the database; otherwise, Blastx (v2.2.28) was used to align the target gene sequences to the selected reference protein sequences, and the networks were then built according to the known interactions of the selected reference species.
Alternative splicing analysis: Alternative splicing (AS) events were classified into 12 basic types by the software Asprofile v1.0. The number of AS events was estimated separately for each sample.
SNP analysis: Picard-tools v1.96 and samtools v0.1.18 were used to sort, mark duplicated reads and reorder the BAM alignment results of each sample. GATK2 software was used to perform SNP calling.
Genome_build: Mus_musculus, GRCm38, ensembl 79
Supplementary_files_format_and_content: Zip files of all processed data of lncRNA and small RNA for samples including Nor, NCD, H1D, H2D, H3D
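As an illustration of the FPKM definition and the gene-level summation described in the quantification step above, here is a minimal Python sketch. It uses the simplified textbook formula (fragments × 10^9 / (transcript length in bp × total mapped fragments)); it is not Cuffdiff's estimator, which assigns fragments to isoforms probabilistically and uses effective lengths. The function names, transcript IDs and numbers are hypothetical and are not taken from this record.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Simplified FPKM: fragments per kilobase of transcript per million mapped fragments.

    Assumption: plain transcript length is used instead of the effective
    length that Cuffdiff estimates internally.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def gene_fpkm(transcript_fpkms, transcript_to_gene):
    """Sum transcript-level FPKMs into gene-level FPKMs."""
    totals = {}
    for transcript_id, value in transcript_fpkms.items():
        gene_id = transcript_to_gene[transcript_id]
        totals[gene_id] = totals.get(gene_id, 0.0) + value
    return totals


# Hypothetical example: two isoforms of one gene and one single-isoform gene.
total_fragments = 20_000_000
transcript_fpkms = {
    "tx1": fpkm(500, 2000, total_fragments),
    "tx2": fpkm(100, 1000, total_fragments),
    "tx3": fpkm(40, 4000, total_fragments),
}
transcript_to_gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
gene_level = gene_fpkm(transcript_fpkms, transcript_to_gene)
```

Summing over isoforms means that a gene's FPKM reflects all of its expressed transcripts, which is why gene-level values can be compared even when the dominant isoform differs between samples.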
|
|
|
Submission date |
Jul 23, 2018 |
Last update date |
Jul 01, 2021 |
Contact name |
Yu Sun |
E-mail(s) |
1225194625@qq.com
|
Phone |
18707192719
|
Organization name |
Wuhan University
|
Street address |
Luojiashan Street
|
City |
Wuhan |
ZIP/Postal code |
430072 |
Country |
China |
|
|
Platform ID |
GPL13112 |
Series (1) |
GSE117539 |
Maternal multi-generational high fat diet (HFD) exposure increases the offspring HCC incidence |
|
Relations |
BioSample |
SAMN09704578 |
SRA |
SRX4420721 |
Supplementary data files not provided |
Raw data are available in SRA |
Processed data are available on Series record |