Genome binding/occupancy profiling by high throughput sequencing; Expression profiling by high throughput sequencing; Other
Summary
Pooled CRISPR-Cas9 screens have recently emerged as a powerful method for functionally characterizing regulatory elements in the non-coding genome, but off-target effects in these experiments have not been systematically evaluated. Here, we conducted multiple genome-scale CRISPR screens for essential CTCF loop anchors in the human K562 erythroid cell line. Surprisingly, the primary drivers of apparent "hits" in this screen were single guide RNAs (sgRNAs) with low sequence specificity. After removing these confounders, we found that no CTCF loop anchors among the ones we screened are essential for cell growth in culture. We also observed analogous effects in independent non-coding screens densely tiling regulatory elements and genomic neighborhoods near previously known essential genes. Strikingly, we found that low-specificity guides also result in strong confounding growth effects in screens employing epigenetic perturbations that do not cause DNA damage, such as CRISPRi and CRISPRa. Remarkably, the set of confounded guides is distinct for each perturbation mode. Promisingly, strict filtering of CRISPRi libraries using GuideScan-aggregate specificity scores removed these confounded sgRNAs and allowed for the identification of essential enhancers, which we validated extensively. Our study presents the first genome-scale functional characterization of CTCF binding sites in the human genome, while also identifying the limitations on and outlining the future prospects for the detailed functional dissection of regulatory elements in the genome using Cas9.
Overall design
This series contains datasets of several different types: 1) pooled CRISPR screens of non-coding genomic elements (tiling screens, screens targeting CTCF loop anchor sites, and "fine-mapping" screens for insulator and enhancer elements) in the K562 cell line and 2) ATAC-seq, RNA-seq, and ChIP-seq experiments examining the changes in gene expression, chromatin accessibility, and protein occupancy in a set of validation cell lines generated using individual sgRNAs.