Sample GSM5400803

Status

Public on Jun 26, 2021

Title

LCC_Omen

Sample type

SRA

Source name

Clinical sample

Organism

Homo sapiens

Characteristics

tissue: CaOv tumor

Extracted molecule

total RNA

Extraction protocol

The RNeasy® PowerLyzer® Tissue & Cells Kit was used to isolate total RNA from ovarian tumor tissues according to the manufacturer's protocol.
RNA libraries were prepared for sequencing using standard Illumina protocols.

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

cDNA

Instrument model

Illumina HiSeq 2000

Data processing

Illumina CASAVA 1.7 software was used for base calling.
Raw reads in FASTQ format were processed with fastp, an ultra-fast FASTQ preprocessor with quality control and filtering features. Clean reads were obtained after quality control, adapter trimming, quality filtering and per-read quality cutting. All downstream analyses were based on these high-quality clean reads.
Reference genome and gene model annotation files were downloaded directly from the genome website. Paired-end clean reads were aligned to the reference genome using HISAT2 v2.1.0 (hierarchical indexing for spliced alignment of transcripts), a highly efficient system for aligning reads from RNA sequencing experiments. HISAT2 uses an indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index, employing two types of indexes for alignment: a whole-genome FM index to anchor each alignment and numerous local FM indexes for very rapid extension of these alignments. HISAT2 is the fastest system currently available, with equal or better accuracy than any other method. HISAT2 was run with the default parameters.
featureCounts was used to count the number of reads mapped to each gene. The FPKM of each gene was then calculated from the length of the gene and the read count mapped to it. FPKM, the expected number of Fragments Per Kilobase of transcript sequence per Million mapped fragments, accounts for both sequencing depth and gene length at the same time, and is a commonly used method for estimating gene expression levels (Trapnell et al., 2010).
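As an illustration of the FPKM calculation described above, here is a minimal Python sketch. It is not the submitters' code: the function name `fpkm`, the dictionary inputs, and the choice to use the sum of gene-assigned counts as the library size are all assumptions made for the example (some pipelines use the total number of mapped fragments instead).

```python
def fpkm(counts, lengths_bp):
    """Return FPKM per gene.

    counts     -- dict: gene ID -> fragment count (e.g. from a count matrix)
    lengths_bp -- dict: gene ID -> gene length in base pairs

    Assumption: library size is the sum of counts over all genes supplied.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no fragments counted; FPKM is undefined")
    # FPKM = count / (length in kb) / (library size in millions)
    #      = count * 1e9 / (length in bp * library size)
    return {
        gene: count * 1e9 / (lengths_bp[gene] * total)
        for gene, count in counts.items()
    }


# Hypothetical toy data, not taken from this sample.
example_counts = {"geneA": 100, "geneB": 300, "geneC": 600}
example_lengths = {"geneA": 1000, "geneB": 2000, "geneC": 3000}
example_fpkm = fpkm(example_counts, example_lengths)
```

With the toy data, a gene with twice the length needs twice the count to reach the same FPKM, which is the length normalization the text refers to.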
Differential expression analysis of two conditions/groups (two biological replicates per condition) was performed using the DESeq R package (1.18.1). DESeq provides statistical routines for determining differential expression in digital gene expression data using a model based on the negative binomial distribution. The resulting P-values were adjusted using the Benjamini and Hochberg approach for controlling the false discovery rate. Genes with |log2(FoldChange)| > 1 and adjusted P-value
Genome_build: Homo sapiens reference genome (GRCh38)
Supplementary_files_format_and_content: Matrix table with raw gene counts for every gene and every sample.
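The Benjamini and Hochberg adjustment and the fold-change filter mentioned under Data processing can be sketched in plain Python as follows. This is an illustrative sketch, not the DESeq implementation: the function names `benjamini_hochberg` and `select_de_genes` are hypothetical, and the adjusted P-value cutoff is a required argument because the record does not state it.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_de_genes(genes, log2_fold_changes, adjusted_p, padj_cutoff):
    """Keep genes with |log2(FoldChange)| > 1 and adjusted P below padj_cutoff.

    padj_cutoff is caller-supplied; no default is assumed here.
    """
    return [
        g
        for g, lfc, padj in zip(genes, log2_fold_changes, adjusted_p)
        if abs(lfc) > 1 and padj < padj_cutoff
    ]


# Hypothetical toy values, not results from this dataset.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_padj = benjamini_hochberg(toy_p)
```

Each raw P-value is scaled by m/rank and then capped by the adjusted value of the next-larger P-value, so the adjusted values never decrease as the raw values increase.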

Submission date

Jun 25, 2021

Last update date

Jun 26, 2021

Contact name

Mingo Ming Ho YUNG

E-mail(s)

h1094157@connect.hku.hk

Organization name

HKU

Department

O&G