Sequence variation in regulatory DNA alters gene expression and shapes genetically complex traits. However, the identification of individual, causal regulatory variants is challenging. Here, we used a massively parallel reporter assay to measure the cis-regulatory consequences of 5,832 natural DNA variants in the promoters of 2,503 genes in the yeast Saccharomyces cerevisiae. We identified 451 causal variants, which underlie genetic loci known to affect gene expression. Several promoters harbored multiple causal variants. In five promoters, pairs of variants showed non-additive, epistatic interactions. Causal variants were enriched at conserved nucleotides, tended to have low derived allele frequency, and were depleted from promoters of essential genes, consistent with the action of negative selection. Causal variants were enriched for alterations in transcription factor binding sites. Models integrating these features provided modest, but statistically significant, ability to predict causal variants. This work revealed a complex molecular basis for cis-acting regulatory variation.
Overall design: This study used a massively parallel reporter assay to measure the cis-effects of yeast promoter variants on gene expression. This repository provides the experimental measurements. The project used designed, synthetic DNA oligos, each tagged with multiple, randomly assigned DNA barcodes. There were two sub-libraries ("TSS" and "Upstream"), each with multiple replicates from separate growth cultures. For each replicate, we sequenced barcodes from cDNA (to measure expression driven by the given oligo; indicated here by a "treatment" called "RNA") and from plasmid DNA (to normalize for unequal library composition; treatment "DNA"). In the TSS library, samples TSS_2016_A1 and TSS_2016_A2 are technical replicates of each other, as are TSS_2016_B1 and TSS_2016_B2. Each sub-library has two processed data files.
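The RNA/DNA normalization described above can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study, and it does not assume any particular layout of the processed data files. The function name, the dictionary inputs, the `min_dna` threshold, the pseudocount, and the use of a per-oligo median are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict
from statistics import median


def log2_rna_dna_per_oligo(rna_counts, dna_counts, barcode_to_oligo,
                           min_dna=10, pseudocount=1.0):
    """Summarize reporter expression per oligo for one replicate.

    rna_counts / dna_counts: dict mapping barcode -> read count
        (treatments "RNA" and "DNA").
    barcode_to_oligo: dict mapping barcode -> oligo identifier.
    min_dna and pseudocount are illustrative assumptions, not values
    taken from the study.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    if rna_total == 0 or dna_total == 0:
        raise ValueError("RNA and DNA samples must each contain reads")

    per_oligo = defaultdict(list)
    for barcode, dna in dna_counts.items():
        # Skip barcodes that are poorly represented in the plasmid library
        # or that cannot be assigned to an oligo.
        if dna < min_dna or barcode not in barcode_to_oligo:
            continue
        rna = rna_counts.get(barcode, 0)
        # Scale by sequencing depth so RNA and DNA samples are comparable.
        rna_frac = (rna + pseudocount) / rna_total
        dna_frac = (dna + pseudocount) / dna_total
        oligo = barcode_to_oligo[barcode]
        per_oligo[oligo].append(math.log2(rna_frac / dna_frac))

    # Collapse the multiple barcodes of each oligo into one value.
    return {oligo: median(values) for oligo, values in per_oligo.items()}
```

Under these assumptions, the cis-effect of a variant in one replicate would be the difference between the values returned for the oligo carrying the alternative allele and the oligo carrying the reference allele. Replicates from separate growth cultures would then be combined in a downstream statistical model.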