Sample GSM8288094
Status Public on Jun 25, 2024
Title RETcoh_BM62
Sample type SRA
 
Source name BREAST CANCER
Organism Homo sapiens
Characteristics tissue: BREAST CANCER
cohort: PITT-RCSI
patientid: 62_Pitt
tumour location: BRAIN METASTATIC TUMOUR
Extracted molecule total RNA
Extraction protocol RNA was extracted from formalin-fixed paraffin-embedded (FFPE) tissue using the Qiagen AllPrep DNA/RNA FFPE kit, on a QIAcube instrument, following standard protocols. Sample quality and concentration were evaluated using 2100 Expert Software - Bioanalyzer System.
Libraries were prepared from 100 ng of total RNA using NUSeq's Illumina Stranded Total RNA-Seq Library Preparation protocol.
 
Library strategy RNA-Seq
Library source transcriptomic
Library selection cDNA
Instrument model Illumina NextSeq 500
 
Description Raw data are available via DUA
Data processing Salmon (v.0.91) was used to perform quasi-mapping of sequencing reads, with seqBias and gcBias corrections enabled, against a 31 bp k-mer index of the GRCh38.p10 (GENCODE v.27) human reference transcripts, to estimate transcript abundance for each sample.
The tximport package was used to import the transcript abundance estimates in the quant.sf files generated by Salmon into the R statistical programming environment for gene expression quantification. Transcript abundance estimates were collapsed to gene-level expression counts. TXI data objects for the MAYO and PITT-RCSI RNA-Seq cohorts, containing unprocessed Salmon read counts, transcripts per million (TPM) and gene length values, were combined for downstream analysis.
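The collapse from transcript-level to gene-level counts can be illustrated with a minimal Python sketch. This is not the tximport implementation (which runs in R and also tracks abundance and length values); it only shows the summation idea. The function name, the identifiers and the numbers are hypothetical.

```python
from collections import defaultdict


def collapse_to_gene(tx_counts, tx2gene):
    """Sum transcript-level estimated counts into gene-level counts.

    tx_counts: dict mapping transcript ID -> estimated count
    tx2gene:   dict mapping transcript ID -> gene ID
    """
    gene_counts = defaultdict(float)
    for tx_id, count in tx_counts.items():
        gene_id = tx2gene.get(tx_id)
        if gene_id is None:
            # Assumption: transcripts without a gene mapping are skipped.
            continue
        gene_counts[gene_id] += count
    return dict(gene_counts)


# Hypothetical identifiers and values, for illustration only.
tx_counts = {"tx1": 10.0, "tx2": 5.5, "tx3": 2.0, "tx4": 7.0}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
gene_counts = collapse_to_gene(tx_counts, tx2gene)
```

Here the two transcripts of geneA are summed, and tx4, which has no mapping, is ignored.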
Genes with little to no expression across all samples were filtered out.
Filtered gene-level counts were normalised with the edgeR calcNormFactors() and cpm() functions, where counts per million (CPM) is defined as CPM = C × 10^6 / N (counts C scaled by the library depth N, expressed in millions). A log2 transformation was applied to CPM: log2(CPM + 1), where 1 is a pseudocount added to avoid taking the logarithm of zero and to keep the transformed values non-negative. Filtered log2 CPM and TPM genes were annotated using the biomaRt R package to identify protein-coding genes using the Ensembl GRCh38.p10 database.
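As a worked illustration of the CPM formula and the log2(CPM + 1) transformation, the following Python sketch computes both for a single sample. It is a simplified stand-in for the edgeR calls named above, not a reimplementation: the norm_factor parameter is an assumption that mimics how edgeR scales the library size by a normalisation factor, and the example counts are hypothetical.

```python
import math


def cpm(counts, norm_factor=1.0):
    """Counts per million for one sample: C * 10^6 / N.

    N is the library size (sum of counts), optionally scaled by a
    normalisation factor to give an effective library size.
    """
    lib_size = sum(counts) * norm_factor
    if lib_size <= 0:
        raise ValueError("library size must be positive")
    return [c * 1e6 / lib_size for c in counts]


def log2_cpm(counts, norm_factor=1.0, pseudocount=1.0):
    """log2(CPM + pseudocount); with pseudocount 1, zero counts map to 0."""
    return [math.log2(v + pseudocount) for v in cpm(counts, norm_factor)]


# Hypothetical gene counts for one sample (library size 1000).
counts = [0, 250, 750]
cpm_values = cpm(counts)
log_values = log2_cpm(counts)
```

With a pseudocount of 1, a gene with zero counts yields log2(1) = 0 rather than an undefined value, and every transformed value is non-negative.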
To correct for batch-driven effects between RNA-Seq cohorts (MAYO, PITT-RCSI, RCSI-Bmt18), filtered protein-coding gene expression counts were used as input to the svaseq R package for batch effect assessment. A multiple linear regression model was fitted for each surrogate variable (SV), with the following independent predictors: sequencing batch ID [1,5], disease status (primary breast (P) / brain metastases (M)), estrogen receptor (ER) IHC status (ER+/ER-) and primary tumour histological subtype (DCIS, IDC, ILC, Mixed ILC/IDC). The linear model fitted for SV1 was found to be significantly associated with sequencing batch (P < 0.01; R^2 ~0.58). Therefore, to correct for the batch-driven effect between RNA-Seq cohorts (MAYO, PITT-RCSI), the removeBatchEffect() function in the limma R package was used with SV1 to adjust gene expression values for the mean shifts in expression attributable to batch.
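The idea of adjusting expression values for a surrogate variable can be sketched in Python for a single gene. This is a deliberately simplified illustration, not the limma removeBatchEffect() implementation: it fits a simple linear regression of expression on one covariate (standing in for SV1) and subtracts the fitted covariate component, without a design matrix protecting biological variables. Centring the covariate so that the gene mean is preserved is an assumption of this sketch, and all names and values are hypothetical.

```python
def remove_covariate_effect(expr, covariate):
    """Subtract the linear effect of one covariate from one gene's values.

    expr:      expression values for one gene, one per sample
    covariate: covariate values (e.g. a surrogate variable), one per sample
    """
    n = len(expr)
    if n != len(covariate) or n < 2:
        raise ValueError("expr and covariate must have equal length >= 2")
    mean_x = sum(covariate) / n
    mean_y = sum(expr) / n
    centred = [x - mean_x for x in covariate]
    sxx = sum(d * d for d in centred)
    if sxx == 0:
        # A constant covariate explains no variation; return values unchanged.
        return list(expr)
    # Least-squares slope of expression on the centred covariate.
    slope = sum(d * (y - mean_y) for d, y in zip(centred, expr)) / sxx
    # Remove the covariate component; the gene mean is left untouched.
    return [y - slope * d for y, d in zip(expr, centred)]


# Hypothetical values: the covariate separates samples 1-2 from samples 3-4.
sv1 = [-1.0, -1.0, 1.0, 1.0]
gene = [4.0, 5.0, 8.0, 9.0]
adjusted = remove_covariate_effect(gene, sv1)
```

In this toy example the mean shift between the two groups of samples is removed, while the within-group differences and the overall gene mean are retained.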
Assembly: GRCh38.p10 (GENCODE v.27)
Supplementary files format and content: .xlsx Excel spreadsheet containing batch-corrected log2 TMM-normalised CPM protein-coding gene expression values for all 16 samples
 
Submission date May 23, 2024
Last update date Jun 25, 2024
Contact name Leonie Young
E-mail(s) lyoung@rcsi.com
Organization name RCSI
Department Surgery
Lab Endocrine Oncology Research Group
Street address 31A York Street
City Dublin
State/province Dublin
ZIP/Postal code D02 HX03
Country Ireland
 
Platform ID GPL18573
Series (1)
GSE268217 RET overexpression leads to increased brain metastatic competency in luminal breast cancer patients

Supplementary data files not provided
Raw data are available in SRA
