Status |
Public on Jun 25, 2024 |
Title |
RETcoh_BM17 |
Sample type |
SRA |
|
|
Source name |
BREAST CANCER
|
Organism |
Homo sapiens |
Characteristics |
tissue: BREAST CANCER
cohort: PITT-RCSI
patientid: 17_Pitt
tumour location: BRAIN METASTATIC TUMOUR
|
Extracted molecule |
total RNA |
Extraction protocol |
RNA was extracted from formalin-fixed paraffin-embedded (FFPE) tissue using the Qiagen AllPrep DNA/RNA FFPE kit on a QIAcube instrument, following standard protocols. Sample quality and concentration were evaluated using the 2100 Expert Software (Bioanalyzer System).
Libraries were prepared from 100 ng of total RNA following NUSeq's Illumina Stranded Total RNA-Seq Library Preparation protocol.
|
|
|
Library strategy |
RNA-Seq |
Library source |
transcriptomic |
Library selection |
cDNA |
Instrument model |
Illumina NextSeq 500 |
|
|
Description |
Raw data are available via DUA
|
Data processing |
Salmon (v.0.91) was used to perform quasi-mapping of sequencing reads, with seqBias and gcBias corrections enabled, using a 31 bp k-mer index of the GRCh38.p10 (GENCODE v.27) human reference transcripts, to estimate transcript abundance for each sample.
The tximport package was used to import the transcript abundance estimates from the quant.sf files generated by Salmon into the R statistical programming environment for gene expression quantification. Transcript abundance estimates were collapsed to gene-level expression counts.
TXI data objects for the MAYO and PITT-RCSI RNA-Seq cohorts, containing unprocessed Salmon read counts, transcripts per million (TPM) and gene length values, were combined for downstream analysis.
Genes with little to no expression across all samples were filtered out.
Filtered gene-level counts were normalised with the edgeR calcNormFactors() and cpm() functions, where counts per million (CPM) is defined as CPM = C × 10^6 / N (counts C divided by the library depth N expressed in millions of reads).
A log2 transformation was applied to CPM: log2(CPM + 1), where 1 is a pseudocount added so that zero counts remain defined under the logarithm and transformed values are never negative.
Filtered log2 CPM and TPM genes were annotated using the biomaRt R package to identify protein-coding genes using the ENSEMBL GRCh38.p10 database.
To correct for batch-driven effects between RNA-Seq cohorts (MAYO, PITT-RCSI, RCSI-Bmt18), filtered protein-coding gene expression counts were used as input to the svaseq() function of the sva R package for batch effect assessment.
A multiple linear regression model was fitted for each surrogate variable (SV), with the following independent predictors: sequencing batch ID [1,5], disease status (primary breast (P) / brain metastases (M)), estrogen receptor (ER) IHC status (ER+/ER-) and primary tumour histological subtype (DCIS, IDC, ILC, Mixed ILC/IDC).
A linear model fitted using SV1 was found to be significantly associated with sequencing batch (P < 0.01; R² ≈ 0.58).
As such, to correct for the batch-driven effect between RNA-Seq cohorts (MAYO, PITT-RCSI), the removeBatchEffect() function of the limma R package was used with SV1 to adjust gene expression values for the mean shifts in expression attributable to batch.
Assembly: GRCh38.p10 (GENCODE v.27)
Supplementary files format and content: .xlsx Excel spreadsheet containing batch-corrected, log2-transformed, TMM-normalised CPM protein-coding gene expression values for all 16 samples
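The CPM and log2(CPM + 1) definitions given in the data processing steps can be illustrated with a minimal Python sketch. This is not the submitters' pipeline, which used edgeR in R. The function names `cpm` and `log2_cpm` and the example counts are hypothetical, and the sketch uses the raw library depth N only. edgeR's cpm() additionally scales the library size by the TMM factors from calcNormFactors(), which is omitted here.

```python
import math


def cpm(counts, lib_size=None):
    """Counts per million for one sample: C * 10^6 / N.

    counts   -- gene-level read counts for a single sample
    lib_size -- library depth N; defaults to the sum of counts
                (assumption: no TMM scaling factor is applied)
    """
    n = sum(counts) if lib_size is None else lib_size
    if n <= 0:
        raise ValueError("library size must be positive")
    return [c * 1e6 / n for c in counts]


def log2_cpm(counts, pseudocount=1.0, lib_size=None):
    """log2(CPM + pseudocount); the pseudocount keeps zero counts defined."""
    return [math.log2(v + pseudocount) for v in cpm(counts, lib_size)]


# Hypothetical counts for four genes in one sample (not real data).
example_counts = [0, 250, 750, 9000]
example_cpm = cpm(example_counts)
example_log2 = log2_cpm(example_counts)
```

With the pseudocount of 1, a gene with zero reads maps to log2(1) = 0, so all transformed values are non-negative.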
|
|
|
Submission date |
May 23, 2024 |
Last update date |
Jun 25, 2024 |
Contact name |
Leonie Young |
E-mail(s) |
lyoung@rcsi.com
|
Organization name |
RCSI
|
Department |
Surgery
|
Lab |
Endocrine Oncology Research Group
|
Street address |
31A York Street
|
City |
Dublin |
State/province |
Dublin |
ZIP/Postal code |
D02 HX03 |
Country |
Ireland |
|
|
Platform ID |
GPL18573 |
Series (1) |
GSE268217 |
RET overexpression leads to increased brain metastatic competency in luminal breast cancer patients |
|
Supplementary data files not provided |
Raw data are available in SRA |
|
|
|
|
|