 |
 |
Status |
Public on Dec 31, 2019 |
Title |
Bayesian Correlation is a robust similarity measure for single cell RNA-seq data |
Organism |
Mus musculus |
Experiment type |
Expression profiling by high throughput sequencing
|
Summary |
Single-cell analysis of the transcriptome deepens our understanding of an individual cell's contribution to its microenvironment. Using single-cell analysis to study complex biological processes requires state-of-the-art computational tools. Many bioinformatics algorithms depend on a similarity measure to detect relationships between biological entities. Similarity can appear by chance, particularly for lowly expressed entities. This is especially relevant in single-cell RNA-seq (scRNA-seq), where read counts are lower than in bulk RNA-seq, so classic bioinformatics tools are insufficient to obtain reproducible results. Recently, a Bayesian correlation scheme that assigns low correlation values to correlations arising from lowly expressed genes was proposed to assess similarity for bulk RNA-seq and miRNA data. This Bayesian method specifies a prior distribution that is then updated with the empirical evidence. Our goal was to extend this Bayesian correlation scheme to scRNA-seq data. We assessed three ways to compute similarity. First, we computed the similarity of each pair of genes over all cells. Second, we identified specific cell populations and computed the correlation within those cells. Third, we computed the similarity of each pair of genes over all clusters, using the total mRNA expression of the cells in each cluster. To study the effect of the number of cells on the method, we did not rely on simulated data; instead, we generated 4 scRNA-seq mouse liver cell libraries with varying numbers of input cells.
Results: We show that Bayesian correlations are more reproducible than Pearson correlations in all the scenarios studied. Compared to Pearson correlations, Bayesian correlations depend less on the number of input cells. We demonstrate that the Bayesian correlation algorithm assigns high similarity values to genes that are biologically relevant in a specific population.
Significance: Our results demonstrate that Bayesian correlation is a robust similarity measure for scRNA-seq datasets. The Bayesian method allows researchers to study similarity between pairs of genes without discarding lowly expressed entities, while limiting the bias introduced by spurious correlations. Taken together, our Bayesian correlation method significantly increases the reproducibility of scRNA-seq experiments.
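The shrinkage described in the summary can be illustrated with a small sketch. It assumes a Beta prior on each gene's expression fraction in each cell (or cluster), updated with the observed read counts, and it adds the average posterior variance to each gene's variance so that genes with few reads receive lower correlations. The function names, the flat prior (`alpha0 = beta0 = 1`), and the toy counts are illustrative assumptions and are not taken from this record; the published method may differ in its choice of prior and other details.

```python
from math import sqrt


def posterior_moments(count, total, alpha0=1.0, beta0=1.0):
    """Posterior mean and variance of a gene's expression fraction.

    Assumption: a Beta(alpha0, beta0) prior updated with `count` reads
    out of `total` reads observed in one cell or cluster.
    """
    a = alpha0 + count
    b = beta0 + (total - count)
    s = a + b
    return a / s, (a * b) / (s * s * (s + 1.0))


def bayesian_correlation(counts_i, counts_j, totals, alpha0=1.0, beta0=1.0):
    """Correlation of two genes across cells (or clusters).

    The covariance uses posterior means; each variance also includes the
    mean posterior variance, which is large relative to the signal when
    counts are low and therefore pulls the correlation toward 0.
    """
    m = len(totals)
    post_i = [posterior_moments(c, n, alpha0, beta0) for c, n in zip(counts_i, totals)]
    post_j = [posterior_moments(c, n, alpha0, beta0) for c, n in zip(counts_j, totals)]
    mean_i = [p for p, _ in post_i]
    mean_j = [p for p, _ in post_j]
    mu_i = sum(mean_i) / m
    mu_j = sum(mean_j) / m
    cov = sum((pi - mu_i) * (pj - mu_j) for pi, pj in zip(mean_i, mean_j)) / m
    var_i = sum((pi - mu_i) ** 2 for pi in mean_i) / m + sum(v for _, v in post_i) / m
    var_j = sum((pj - mu_j) ** 2 for pj in mean_j) / m + sum(v for _, v in post_j) / m
    return cov / sqrt(var_i * var_j)


# Toy data (hypothetical): four cells with 100,000 reads each.
totals = [100_000] * 4
high = [1000, 2000, 3000, 4000]  # well-covered gene
low = [0, 1, 0, 1]               # gene with almost no reads

# Each gene is compared with an identical profile, so Pearson would give 1
# in both cases. Here the low-count pair is shrunk toward 0 while the
# high-count pair stays close to 1.
r_high = bayesian_correlation(high, high, totals)
r_low = bayesian_correlation(low, low, totals)
print(r_high, r_low)
```

The same function applies to the cluster-level comparison if `counts_i`, `counts_j`, and `totals` hold per-cluster sums instead of per-cell counts.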
|
|
|
Overall design |
Four single-cell RNA-seq liver samples containing parenchymal and non-parenchymal cells were sequenced from the same animal, varying the number of input cells: 1000, 2000, 5000 and 10000.
|
Web link |
https://academic.oup.com/nargab/article/2/1/lqaa002/5715215
|
|
|
Contributor(s) |
Sanchez-Taltavull D, Perkins TJ, Dommann N, Melin N, Keogh A, Stroka D, Beldi G |
Citation(s) |
33575552, 33866021 |
Submission date |
Jul 11, 2019 |
Last update date |
Jul 27, 2021 |
Contact name |
Daniel Sanchez-Taltavull |
E-mail(s) |
daniel.sanchez@dbmr.unibe.ch
|
Phone |
+41316328741
|
Organization name |
University of Bern
|
Department |
Department of BioMedical Research
|
Lab |
Visceral Surgery
|
Street address |
Murtenstrasse 35
|
City |
Bern |
ZIP/Postal code |
3013 |
Country |
Switzerland |
|
|
Platforms (1) |
GPL24247 |
Illumina NovaSeq 6000 (Mus musculus) |
|
Samples (4)
|
|
Relations |
BioProject |
PRJNA554015 |
SRA |
SRP214201 |
Supplementary file | Size | Download | File type/resource |
GSE134134_RAW.tar | 22.6 Mb | (http)(custom) | TAR (of CSV) |
SRA Run Selector |
Raw data are available in SRA |
Processed data provided as supplementary file |
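Once the supplementary archive has been downloaded, its contents can be inspected before extraction. The sketch below uses Python's standard `tarfile` module. The local path in the usage comment is hypothetical, and because the names of the files inside the archive are not listed in this record, the function only filters by extension rather than assuming any particular names.

```python
import tarfile


def list_csv_members(tar_path):
    """Return the names of CSV files (plain or gzipped) inside a TAR archive."""
    with tarfile.open(tar_path) as archive:
        return [
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.lower().endswith((".csv", ".csv.gz"))
        ]


# Usage (hypothetical local path to the downloaded archive):
# for name in list_csv_members("GSE134134_RAW.tar"):
#     print(name)
```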