Series GSE255224
Status Public on Feb 09, 2024
Title Single Nucleotide Polymorphism (SNP) and Antibody-based Cell Sorting (SNACS): A tool for demultiplexing single-cell DNA sequencing data
Organism Homo sapiens
Experiment type Other
Summary Motivation
Recently, single-cell DNA sequencing (scDNA-seq) and multi-modal profiling with the addition of cell-surface antibodies (scDAb-seq) have provided key insights into cancer heterogeneity. Scaling these technologies across large patient cohorts, however, is frequently cost and time prohibitive. Multiplexing, in which cells from unique patients are pooled into a single experiment, offers a possible solution. While multiplexing methods are available for scRNAseq, accurate demultiplexing in scDNAseq remains an unmet need.
Results
Here, we introduce SNACS: Single-Nucleotide Polymorphism (SNP) and Antibody-based Cell Sorting. SNACS relies on a combination of patient-level cell-surface identifiers and natural variation in genetic polymorphisms to demultiplex scDNAseq data. We demonstrated the performance of SNACS on a dataset consisting of multi-sample experiments from patients with leukemia where we knew truth from single-sample experiments from the same patients. Relative to demultiplexing methods derived from the scRNAseq literature, SNACS offered superior accuracy.
Availability and Implementation
SNACS is available at https://github.com/olshena/SNACS.
 
Overall design This study includes samples from 8 de-identified adult patients with acute myeloid leukemia. The samples underwent single-cell DNA sequencing with the inclusion of cell-surface antibody-conjugated oligonucleotides to serve as sample-level identifiers. To develop and test a novel demultiplexing algorithm, the samples were pooled in multiple combinations according to the following scheme: Sample 1: Patient A; Sample 2: Patient B; Sample 3: Patient C; Sample 4: Patient D; Sample 5: Patient A and B; Sample 6: Patient B, C, and D; Sample 7: Patient A, B, C, and D; Sample 8: Patient E and F; Sample 9: Patient E, F, G, and H; Sample 10: Patient A, B, C, D, E, F, G, and H; Sample 11: Patient A, B, C, D, E, F, G, and H. Aside from Sample 11, which is a biological replicate of Sample 10, no replicates were included. Samples 1-4 served as controls for Samples 5-7.
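The pooling scheme above can be written down as a small lookup table, which makes it easy to check which pooled samples have single-patient controls. This is only an illustrative sketch: the names `DESIGN`, `single_patient_controls`, and `fully_controlled` are hypothetical and are not part of SNACS or the DAb-seq pipeline.

```python
# Pooling design transcribed from the "Overall design" text:
# sample number -> set of patients whose cells were pooled in that sample.
DESIGN = {
    1: {"A"},
    2: {"B"},
    3: {"C"},
    4: {"D"},
    5: {"A", "B"},
    6: {"B", "C", "D"},
    7: {"A", "B", "C", "D"},
    8: {"E", "F"},
    9: {"E", "F", "G", "H"},
    10: set("ABCDEFGH"),
    11: set("ABCDEFGH"),  # biological replicate of Sample 10
}


def single_patient_controls(design):
    """Map each patient to the sample that contains only that patient."""
    return {
        next(iter(patients)): sample
        for sample, patients in design.items()
        if len(patients) == 1
    }


def fully_controlled(design):
    """Return pooled samples in which every patient also has a single-patient sample."""
    controlled_patients = set(single_patient_controls(design))
    return sorted(
        sample
        for sample, patients in design.items()
        if len(patients) > 1 and patients <= controlled_patients
    )


print(single_patient_controls(DESIGN))
print(fully_controlled(DESIGN))
```

Under this encoding, only Samples 5-7 are fully covered by the single-patient Samples 1-4, which matches the statement that Samples 1-4 served as controls for Samples 5-7.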
The hdf5 files were output directly from an open-source cell calling and alignment pipeline for single-cell DNA sequencing data (https://github.com/AbateLab/DAb-seq). Data have not been normalized or filtered unless specified below. The hdf5 files contain the following matrices:
AB_DESCRIPTIONS: Names of cell-surface antibodies used
ABS: Antibody counts using various calling methods; rows are single cells and columns are antibodies. Note that for this analysis and manuscript, we used the CLR (centered log-ratio transformed) antibody counts. Other methods are described in https://github.com/AbateLab/DAb-seq/umi_tools/Documentation.py.
AMPLICONS: Two matrices: (1) read counts for all DNA amplicons; rows are single cells and columns are DNA amplicons, and (2) names of all DNA amplicons used in the experiment.
AD: Alternate allele depth; rows are single cells and columns are genetic variants (SNPs)
CELL_BARCODES: Names of all called single-cell barcodes in the experiment
DP: Total read depth; rows are single cells and columns are genetic variants (SNPs)
GQ: Quality score, ranging from 0 (worst quality) to 100 (best quality); rows are single cells and columns are genetic variants (SNPs)
GT: Mutation call where 0 = not mutated, 1 or 2 = mutated, and 3 = NA; rows are single cells and columns are genetic variants (SNPs)
RD: Reference allele depth; rows are single cells and columns are genetic variants (SNPs)
VARIANTS: Names of all variants (SNPs) in the experiment
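
As a rough illustration of how the matrices listed above relate to each other, the sketch below computes a per-cell alternate allele fraction (AD / DP) and masks entries that are missing (GT = 3) or of low quality. Several things here are assumptions rather than facts from this record: that AD, DP, GQ, and GT are stored as top-level datasets with exactly those names, and the thresholds `min_gq` and `min_dp`, which are hypothetical and not taken from SNACS. The loader needs the third-party `h5py` package; the rest needs only NumPy.

```python
import numpy as np


def load_genotype_matrices(path):
    """Read AD, DP, GQ and GT from a genotypes HDF5 file.

    ASSUMPTION: the four matrices are top-level datasets with these names.
    Inspect the file (e.g. list(f.keys())) before relying on this layout.
    """
    import h5py  # third-party; imported lazily so the rest runs without it

    with h5py.File(path, "r") as f:
        return {name: f[name][()] for name in ("AD", "DP", "GQ", "GT")}


def alt_allele_fraction(ad, dp, gt, gq, min_gq=30, min_dp=10):
    """Alternate allele fraction AD / DP, cells in rows and variants in columns.

    Entries are NaN where the call is NA (GT == 3) or where GQ or DP falls
    below the thresholds. The threshold values are hypothetical.
    """
    ad = np.asarray(ad, dtype=float)
    dp = np.asarray(dp, dtype=float)
    usable = (
        (np.asarray(gt) != 3)
        & (np.asarray(gq) >= min_gq)
        & (dp >= min_dp)
        & (dp > 0)
    )
    af = np.full(ad.shape, np.nan)
    # Divide only where the entry is usable; other entries keep their NaN.
    np.divide(ad, dp, out=af, where=usable)
    return af


# Toy data: 2 cells (rows) x 2 variants (columns).
ad = [[0, 12], [20, 5]]
dp = [[40, 24], [20, 0]]
gt = [[0, 1], [2, 3]]
gq = [[99, 80], [10, 0]]

af = alt_allele_fraction(ad, dp, gt, gq)
print(af)
```

In the toy data the first cell has usable calls at both variants (fractions 0.0 and 0.5), while both entries for the second cell are masked, one for low GQ and one because GT is NA.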

This study includes samples from 4 de-identified adult patients with acute myeloid leukemia. The samples underwent single-cell DNA sequencing with the inclusion of cell-surface antibody-conjugated oligonucleotides to serve as sample-level identifiers. To develop and test a novel demultiplexing algorithm, the samples were pooled in multiple combinations according to the following scheme: Sample 1: Patient A; Sample 2: Patient B; Sample 3: Patient C; Sample 4: Patient D; Sample 5: Patient A and B; Sample 6: Patient B, C, and D; Sample 7: Patient A, B, C, and D. No replicates were included, and Samples 1-4 served as controls for Samples 5-7.
 
Contributor(s) Kennedy VE, Roy R, Peretz C, Koh A, Tran E, Smith C, Olshen A
Citation(s) 38370638
Submission date Feb 06, 2024
Last update date Jul 22, 2024
Contact name Catherine Smith
E-mail(s) Catherine.Smith@ucsf.edu
Organization name University of California, San Francisco
Department Medicine
Lab Smith Lab
Street address 513 Parnassus Avenue
City San Francisco
State/province California
ZIP/Postal code 94143
Country USA
 
Platforms (1)
GPL24676 Illumina NovaSeq 6000 (Homo sapiens)
Samples (15)
GSM8066757 HSPCs, barcode-derived cDNA, 1
GSM8066758 HSPCs, barcode-derived cDNA, 2
GSM8066759 HSPCs, barcode-derived cDNA, 3
Relations
BioProject PRJNA1073979


Supplementary file                       Size      File type/resource
GSE255224_Experiment10.genotypes.hdf5    147.4 Mb  HDF5
GSE255224_Experiment11.genotypes.hdf5    100.3 Mb  HDF5
GSE255224_Experiment8.genotypes.hdf5     67.4 Mb   HDF5
GSE255224_Experiment9.genotypes.hdf5     28 b      HDF5
GSE255224_RAW.tar                        642.5 Mb  TAR (of HDF5)
GSE255224_SNACS_ADT_features.csv.gz      378 b     CSV
Raw data are available in SRA
