Sample GSM2086281
Status Public on Mar 12, 2016
Title control_run56
Sample type SRA
 
Source name 10 day old seedling
Organism Arabidopsis thaliana
Characteristics strain: Col-0
tissue: seedling
chip antibody: F3165 Sigma Monoclonal ANTI-FLAG M2
Growth protocol liquid culture
Extracted molecule genomic DNA
Extraction protocol Chromatin‐immunoprecipitation experiments were carried out as described by Kwon et al. (Kwon et al, 2005), except that anti‐FLAG M2 magnetic beads (Sigma) were used and immunoprecipitations were only performed for two hours.
ChIP‐seq library preparation and Illumina sequencing were performed as described by Yant et al. (Yant et al, 2010).
 
Library strategy ChIP-Seq
Library source genomic
Library selection ChIP
Instrument model Illumina Genome Analyzer II
 
Data processing Standard Illumina base calling software was used to base call the 40-42 nucleotide sequence reads. We used SHORE (http://shore.sourceforge.net) as a platform for further analysis. The obtained reads were quality filtered and low quality bases at the 3’ end were pruned as described. GenomeMapper (http://1001genomes.org/software/) was used for mapping to the TAIR9 genome, allowing for up to four mismatching nucleotides and no gaps.
To proceed, the mapped data were subjected to a heuristic for removal of duplicate sequence reads which were assumed to be uninformative for the detection of enriched loci. A threshold was applied limiting the number of 5’ ends mapping to the same position on the same strand. To retain the power to discriminate between multiple strongly enriched regions, the threshold for any particular position was varied depending on the coverage in close vicinity, such that the variance of the number of reads per position would roughly equal its mean in a 30 bp sliding window.
We further applied a two-step procedure to identify regions significantly enriched in the positive sample when compared to the control. First, potentially enriched regions were identified based on the positive samples only. These sites were then directly compared to the corresponding control sample regions to assess statistical significance.
For estimation of the depth of coverage for each position in the genome, all positive sample reads mapping to unique positions were extended in 3’ direction to 130 bp, corresponding to half the experimentally observed approximate DNA fragment size, while discarding all other reads. To detect possible peak sites, a 2 kb wide sliding window was applied to the coverage graph in single base steps. In each step a p-value was assigned to the coverage value at the central base using a one-sided Poisson test, with the distribution parameter set to the average coverage within the sliding window.
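The sliding-window Poisson test described above can be illustrated with a minimal sketch. This is not the SHORE implementation; the function names `poisson_sf` and `window_pvalues` are hypothetical, the window is truncated at sequence ends (an assumption the record does not state), and the window mean is taken over positions with nonzero coverage only, as the record specifies. The direct summation is only suitable for moderate coverage values.

```python
import math

def poisson_sf(k, lam):
    """Upper tail P(X >= k) for X ~ Poisson(lam), by direct summation."""
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    term = math.exp(-lam)          # P(X = 0)
    cdf = term
    for i in range(1, k):          # accumulate P(X = 1) .. P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def window_pvalues(coverage, window=2000):
    """One-sided Poisson p-value per position against the local mean coverage.

    Assumption: the window is centred on each position and clipped at the
    ends of the sequence.
    """
    half = window // 2
    pvals = []
    for i, cov in enumerate(coverage):
        lo, hi = max(0, i - half), min(len(coverage), i + half + 1)
        covered = [c for c in coverage[lo:hi] if c > 0]   # skip zero-coverage positions
        lam = sum(covered) / len(covered) if covered else 0.0
        pvals.append(poisson_sf(cov, lam))
    return pvals
```

For example, `window_pvalues(coverage)` on a per-base coverage list returns one p-value per position, which can then be scanned for runs below a chosen cutoff.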
Only positions with coverage greater than zero were included in the calculation of the average, assuming all other positions to be inaccessible to the experiment. Finally, any consecutive stretch of positions with p-value <0.05 and length >130 bp was retained as a potentially enriched site.
To further reduce the number of regions to be considered, each was checked for unwarranted high average coverage in the control sample. A potential peak in the positive sample was discarded if the mean control coverage in the corresponding region was larger than the median of the average control coverage across all peak regions plus a tolerance of three standard deviations.
For assignment of final p-values to each candidate region, in each replicate a one-sided binomial test was applied to the number of reads mapping to the region in the positive sample, with the distribution parameter N set to the joint read count for the site for the positive and the corresponding control sample. To estimate the probability parameter for the test, from now on called r, we computed a scaling factor s for the control sample and the chromosome containing the considered region. The complete chromosome sequence was subdivided into 400 base pair bins, and for each bin the positive sample as well as the control sample read counts were recorded. Then s was chosen such that the median ChIP sample read count for all bins equaled the median control sample read count multiplied by s. From this the binomial test parameter r was calculated as r=s/(s+1).
Finally, false discovery rates were obtained through the Benjamini-Hochberg correction method. To establish a ranking of peak regions across replicates, the rank product over the per-replicate FDR ranks was used. To define the bound DNA set, we used all peaks with FDR < 0.1 in both ChIP experiments, resulting in a set of 1564 peaks.
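The per-region test and the multiple-testing correction can be sketched as follows. This is an illustrative reading of the description, not the submitters' code; the function names are hypothetical, and the sketch assumes the median control bin count is nonzero.

```python
import math
from statistics import median

def scaling_factor(chip_bins, control_bins):
    """s such that median(ChIP bin counts) == s * median(control bin counts)."""
    return median(chip_bins) / median(control_bins)

def binom_sf(k, n, r):
    """Upper tail P(X >= k) for X ~ Binomial(n, r)."""
    return sum(math.comb(n, i) * r**i * (1 - r)**(n - i) for i in range(k, n + 1))

def region_pvalue(chip_reads, control_reads, s):
    """One-sided binomial p-value with N = joint read count and r = s / (s + 1)."""
    r = s / (s + 1)
    n = chip_reads + control_reads
    return binom_sf(chip_reads, n, r)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):       # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With s = 1 (equal sequencing depth), r is 0.5, so a region is tested against an even split of reads between ChIP and control; larger s shifts the expected share toward the ChIP sample.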
Genome_build: TAIR9
Supplementary_files_format_and_content: text file with peak positions and signal intensity for both rev and control samples.
 
Submission date Mar 11, 2016
Last update date May 15, 2019
Contact name Stephan Wenkel
E-mail(s) wenkel@plen.ku.dk
Organization name University of Copenhagen
Street address Thorvaldsensvej 40
City Frederiksberg C
ZIP/Postal code 1871
Country Denmark
 
Platform ID GPL9302
Series (1)
GSE26722 Role of class III HD‐ZIP transcription factors in shade signaling
Relations
BioSample SAMN04546530
SRA SRX1629223

Supplementary data files not provided
Raw data are available in SRA
Processed data are available on Series record
