GEO Accession viewer

Sample GSM8288089

Status

Public on Jun 25, 2024

Title

RETcoh_BM17

Sample type

SRA

Source name

BREAST CANCER

Organism

Homo sapiens

Characteristics

tissue: BREAST CANCER
cohort: PITT-RCSI
patientid: 17_Pitt
tumour location: BRAIN METASTATIC TUMOUR

Extracted molecule

total RNA

Extraction protocol

RNA was extracted from formalin-fixed paraffin-embedded (FFPE) tissue using the Qiagen AllPrep DNA/RNA FFPE kit on a QIAcube instrument, following standard protocols. Sample quality and concentration were evaluated using the 2100 Expert Software of the Bioanalyzer System.
Libraries were prepared from 100 ng of total RNA using NUSeq's Illumina Stranded Total RNA-Seq Library Preparation protocol.

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

cDNA

Instrument model

Illumina NextSeq 500

Description

Raw data are available via DUA

Data processing

Salmon (v.0.91) was used to perform quasi-mapping of sequencing reads, with seqBias and gcBias corrections enabled, against a 31 bp k-mer index of the GRCh38.p10 (GENCODE v.27) human reference transcripts, to estimate transcript abundance for each sample.
The tximport package was used to import the transcript abundance estimates from the quant.sf files generated by Salmon into the R statistical programming environment for gene expression quantification. Transcript abundance estimates were collapsed to gene-level expression counts. TXI data objects for the MAYO and PITT-RCSI RNA-Seq cohorts, containing unprocessed Salmon read counts, transcripts per million (TPM) and gene length values, were combined for downstream analysis.
Genes with little to no expression across all samples were filtered out.
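The collapse from transcript-level to gene-level counts described above can be sketched as follows. This is a minimal illustration in Python, not the tximport implementation: the transcript and gene identifiers, the `tx2gene` mapping and the count values are hypothetical, and only the summation of estimated counts is shown.

```python
from collections import defaultdict


def collapse_to_gene(tx_counts, tx2gene):
    """Sum transcript-level estimated counts into gene-level counts."""
    gene_counts = defaultdict(float)
    for tx_id, count in tx_counts.items():
        # Each transcript contributes its count to its parent gene.
        gene_counts[tx2gene[tx_id]] += count
    return dict(gene_counts)


# Hypothetical identifiers and values, for illustration only.
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
tx_counts = {"tx1": 10.5, "tx2": 4.5, "tx3": 7.0}

gene_counts = collapse_to_gene(tx_counts, tx2gene)
```

Here `gene_counts` maps each gene to the sum of the counts of its transcripts.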
Filtered gene-level counts were normalised with the edgeR calcNormFactors() and cpm() functions, where counts per million (CPM) is defined as CPM = C * 10^6 / N (counts (C) scaled by the library depth (N) in units of millions). A log2 transformation was applied to CPM: log2(CPM + 1), where 1 is a pseudocount added to prevent negative values. Filtered log2 CPM and TPM genes were annotated using the biomaRt R package to identify protein-coding genes using the Ensembl GRCh38.p10 database.
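The CPM formula and log2 transformation stated above can be illustrated with a short Python sketch. It is not the edgeR implementation: the function name `log2_cpm` and the example counts are hypothetical, and the TMM normalisation factors that calcNormFactors() estimates to rescale the library size are not reproduced, so N here is simply the raw library total.

```python
import math


def log2_cpm(counts, pseudocount=1.0):
    """Return log2(CPM + pseudocount) for one sample's gene counts.

    CPM = C * 10^6 / N, with N taken as the sum of all counts in the
    sample (an assumption; no TMM scaling is applied).
    """
    library_size = sum(counts)
    if library_size <= 0:
        raise ValueError("library size must be positive")
    return [math.log2(c * 1e6 / library_size + pseudocount) for c in counts]


# Hypothetical counts for three genes in one sample (N = 1000).
sample_counts = [0, 250, 750]
values = log2_cpm(sample_counts)
```

With a pseudocount of 1, a gene with zero counts maps to exactly 0 and no transformed value is negative.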
To correct for batch-driven effects between RNA-Seq cohorts (MAYO, PITT-RCSI, RCSI-Bmt18), filtered protein-coding gene expression counts were used as input to svaseq (sva R package) for batch effect assessment. A multiple linear regression model was fitted for each surrogate variable (SV), with the following independent predictors: sequencing batch ID [1,5], disease status (primary breast (P) / brain metastases (M)), estrogen receptor (ER) IHC status (ER+/ER-) and primary tumour histological subtype (DCIS, IDC, ILC, Mixed ILC/IDC). The model fitted for SV1 was significantly associated with sequencing batch (P < 0.01; R^2 ~ 0.58). Accordingly, to correct for the batch-driven effect between RNA-Seq cohorts (MAYO, PITT-RCSI), the removeBatchEffect() function of the limma R package was used with SV1 to adjust gene expression values for batch-driven mean shifts in expression.
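The idea of adjusting expression values for a single surrogate variable can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the limma code: each gene is regressed on the mean-centred covariate with an intercept only (no design matrix protecting biological factors), and the fitted covariate term is subtracted. The function name and all values are hypothetical.

```python
def remove_covariate_effect(expression, covariate):
    """Subtract the least-squares effect of one covariate from each gene.

    expression: list of per-gene rows, one value per sample.
    covariate:  one value per sample (e.g. a surrogate variable).
    """
    n = len(covariate)
    mean_x = sum(covariate) / n
    centred = [x - mean_x for x in covariate]
    sxx = sum(x * x for x in centred)
    if sxx == 0:
        raise ValueError("covariate has no variance")

    adjusted = []
    for row in expression:
        mean_y = sum(row) / n
        # Ordinary least-squares slope of the gene on the covariate.
        slope = sum(x * (y - mean_y) for x, y in zip(centred, row)) / sxx
        # Remove the covariate term; the gene mean is left unchanged.
        adjusted.append([y - slope * x for x, y in zip(centred, row)])
    return adjusted


# Hypothetical surrogate variable and log2 expression values (2 genes, 4 samples).
sv1 = [-1.0, 0.0, 1.0, 2.0]
expression = [[1.0, 3.0, 5.0, 7.0], [4.0, 4.0, 4.0, 4.0]]
adjusted = remove_covariate_effect(expression, sv1)
```

In this toy input the first gene varies linearly with the covariate, so after adjustment it is flat at its mean, while the second gene, which does not depend on the covariate, is unchanged.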
Assembly: GRCh38.p10 (GENCODE v.27)
Supplementary files format and content: .xlsx Excel spreadsheet containing batch-corrected, log2 TMM-normalised CPM protein-coding gene expression values for all 16 samples

Submission date

May 23, 2024

Last update date

Jun 25, 2024

Contact name

Leonie Young

E-mail(s)

lyoung@rcsi.com

Organization name

RCSI

Department

Surgery

Lab

Endocrine Oncology Research Group

Street address

31A York Street

City

Dublin

State/province

Dublin

ZIP/Postal code

D02 HX03

Country

Ireland

Platform ID

GPL18573

Series (1)

GSE268217

RET overexpression leads to increased brain metastatic competency in luminal breast cancer patients

Supplementary data files not provided

Raw data are available in SRA