Status |
Public on Sep 11, 2023 |
Title |
P053 |
Sample type |
SRA |
|
|
Source name |
3rd and 4th rosette leaves
|
Organism |
Arabidopsis thaliana |
Characteristics |
tissue: 3rd and 4th rosette leaves
genotype: Ws-2
|
Growth protocol |
Seeds of Arabidopsis thaliana ecotype Wassilewskija (Ws-2) were surface sterilized and sown on MS-agar plates containing 1% sucrose. Plates were cold treated at 4 °C in the dark for 4 days, then moved to 16 h light / 8 h dark cycles at a constant temperature of 21 °C. After 10 days on plates, seedlings were transferred to soil (with 5% sand), and the 3rd and 4th true rosette leaves were tracked for future reference. After 10 days in soil, 75 plants were selected from the 225 plants grown, chosen to ensure a diversity of bolting statuses. At ZT4 (4 h after lights on) the following day, the 3rd and 4th rosette leaves of each individual plant were pooled and flash frozen in liquid nitrogen.
|
Extracted molecule |
polyA RNA |
Extraction protocol |
Total RNA was isolated from these samples using the Qiagen RNeasy Plant Mini Kit (Cat no. 74904). Residual genomic DNA was removed using the Invitrogen Turbo DNA-free kit (Cat no. AM1907), according to the manufacturer’s protocol. Libraries were prepared with the NEBNext Ultra II Directional Library Prep Kit for Illumina (Cat no. E7765), using the NEBNext poly(A) magnetic isolation module (Cat no. E7490). Quality control was performed with the Agilent 2100 Bioanalyzer instrument (Part no. G2939BA). Finally, a total of 70 libraries were pooled and sequenced by Novogene on one lane of an Illumina NovaSeq system.
|
|
|
Library strategy |
RNA-Seq |
Library source |
transcriptomic |
Library selection |
cDNA |
Instrument model |
Illumina NovaSeq 6000 |
|
|
Description |
4 days of stratification at 4 °C; 10 days grown on plates (16 h light / 8 h dark); 10 days grown in soil (same photoperiod); then sampled
|
Data processing |
Before analysis of the raw sequencing data, FastQC v0.11.7 (Andrews et al. 2012) was used to assess read quality.
Illumina adapters were trimmed using Cutadapt v3.4 (Martin 2011). (parameters: -a AGATCGGAAGAG -A AGATCGGAAGAG -j 0 -q 10,10 -m 30 -u 10 -U 10)
Reads were quantified using Salmon v1.6.0 (Patro et al. 2017) and the TAIR10 transcriptome (Berardini et al. 2015). (parameters: -l A --validateMappings)
To analyse genetic variation within the population, we followed GATK guidelines for short variant discovery from RNA-seq data.
We used STAR (v2.7.10b; two-pass mode) to align reads to the TAIR10 genome, revision 56 (Dobin et al. 2013). (parameters: --runThreadN 20 --readFilesIn ${file}_1.fq.gz ${file}_2.fq.gz --readFilesCommand "gunzip -c" --outSAMtype BAM SortedByCoordinate --limitBAMsortRAM 5000000000 --twopassMode Basic --outFilterScoreMinOverLread 0 --outFilterMatchNminOverLread 0 --outFilterMatchNmin 0 --outFilterMismatchNmax 1)
We then pre-processed the aligned reads using the commands ‘MarkDuplicates’ (parameters: gatk --java-options '-Xmx28G' MarkDuplicates -ASO coordinate --VERBOSITY DEBUG --MAX_RECORDS_IN_RAM 5000000) and ‘SplitNCigarReads’ (parameters: gatk --java-options '-Xmx28G' SplitNCigarReads --verbosity DEBUG) from GATK (v4.3.0.0) (Poplin et al. 2018).
We used the command ‘HaplotypeCaller’ (parameters: -ERC GVCF --dont-use-soft-clipped-bases true --standard-min-confidence-threshold-for-calling 20) to produce one genomic variant call format (gVCF) file per sample.
Finally, we combined the per-sample gVCF files using ‘CombineGVCFs’ (parameters: gatk --java-options '-Xmx60G' CombineGVCFs) and identified single-nucleotide variants (SNVs) and insertions/deletions (indels) that were confidently called across the whole population using ‘GenotypeGVCFs’ (parameters: gatk --java-options "-Xmx60g" GenotypeGVCFs).
For further processing of these initial SNVs and indels, we followed the filtering guidelines suggested by Cruz et al. (2020).
Specifically, we selected only biallelic variants with a minimum genotype quality of 40 that were called in at least 80% of all samples, using VCFtools (v0.1.16) (Danecek et al. 2011). (parameters: --minGQ 40 --max-missing 0.8 --out joint_genotyping/all_genotyped_filtered.vcf --recode)
We then imputed missing genotypes using Beagle (v5.4, 22Jul22, 46e) on default settings (Browning, Zhou, and Browning 2018). (parameters: java -Xmx10240m -jar beagle.22Jul22.46e.jar nthreads=2)
Note that, because Beagle phases genotypes, the variant call format (VCF) file includes two representations of the same heterozygous genotype (‘0|1’ and ‘1|0’).
Finally, we selected variants with a minor allele frequency (MAF) of at least 0.05, using VCFtools. (parameters: vcftools --maf 0.05 --recode)
Assembly: TAIR10
Supplementary files format and content: Salmon quantification folder for each sample, included as a .tar file
Supplementary files format and content: A final VCF file for all samples
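The genotype-level thresholds described above (genotype quality of at least 40, calls in at least 80% of samples, MAF of at least 0.05) can be illustrated with a small Python sketch. This is not the submitters' code, which used VCFtools and Beagle. The function names and the five-sample toy site are hypothetical, and the sketch assumes biallelic diploid calls. It also shows why ‘0|1’ and ‘1|0’ should be counted as the same heterozygous genotype.

```python
from typing import List, Optional, Tuple

# A diploid call as (allele, allele); None marks a missing genotype.
Genotype = Optional[Tuple[int, int]]


def parse_genotype(gt: str, gq: Optional[int], min_gq: int = 40) -> Genotype:
    """Parse a biallelic GT string, masking calls whose GQ is below min_gq."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles or gq is None or gq < min_gq:
        return None
    return int(alleles[0]), int(alleles[1])


def is_heterozygous(genotype: Genotype) -> bool:
    """'0|1' and '1|0' differ only in phase; both are heterozygous."""
    return genotype is not None and genotype[0] != genotype[1]


def call_rate(site: List[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype at this site."""
    return sum(g is not None for g in site) / len(site)


def minor_allele_frequency(site: List[Genotype]) -> float:
    """Frequency of the rarer allele among called alleles (biallelic only)."""
    alleles = [a for g in site if g is not None for a in g]
    if not alleles:
        return 0.0
    alt_freq = sum(alleles) / len(alleles)
    return min(alt_freq, 1.0 - alt_freq)


# Toy site with five hypothetical samples, given as (GT, GQ) pairs.
toy_calls = [("0/0", 99), ("0|1", 45), ("1|0", 60), ("1/1", 20), ("./.", None)]
toy_site = [parse_genotype(gt, gq) for gt, gq in toy_calls]

keep_by_call_rate = call_rate(toy_site) >= 0.8  # analogous to --max-missing 0.8
keep_by_maf = minor_allele_frequency(toy_site) >= 0.05  # analogous to --maf 0.05
```

In the workflow described above, the MAF threshold was applied after imputation, when no genotypes are missing. The toy site applies both checks to the same data only to show the arithmetic.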
|
|
|
Submission date |
Sep 07, 2023 |
Last update date |
Sep 11, 2023 |
Contact name |
Ethan James Redmond |
E-mail(s) |
ethan.redmond@york.ac.uk
|
Organization name |
University of York
|
Department |
Department of Biology
|
Lab |
Ezer lab
|
Street address |
Wentworth Way
|
City |
York |
ZIP/Postal code |
YO10 5DD |
Country |
United Kingdom |
|
|
Platform ID |
GPL26208 |
Series (1) |
GSE242681 |
Single-plant-omics reveals the cascade of transcriptional changes during the vegetative-to-reproductive transition |
|
Relations |
BioSample |
SAMN37318769 |
SRA |
SRX21665197 |
Supplementary file | Size | Download | File type/resource |
GSM7766786_P053_quant.tar.gz | 616.7 Kb | (ftp)(http) | TAR |
SRA Run Selector |
Raw data are available in SRA |
Processed data provided as supplementary file |
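As a rough illustration of how the processed supplementary file might be read, the sketch below uses only the Python standard library to pull TPM values from a Salmon `quant.sf` table inside a tar archive. The internal layout of `GSM7766786_P053_quant.tar.gz` is an assumption: the function, whose name `read_quant_tpm` is hypothetical, takes the first member whose name ends in `quant.sf`. The `Name` and `TPM` column names follow Salmon's standard quantification output.

```python
import csv
import io
import tarfile
from typing import Dict


def read_quant_tpm(archive_path: str) -> Dict[str, float]:
    """Return {transcript name: TPM} from the first quant.sf in a tar archive."""
    # "r:*" lets tarfile detect gzip compression automatically.
    with tarfile.open(archive_path, mode="r:*") as tar:
        for member in tar:
            if member.isfile() and member.name.endswith("quant.sf"):
                handle = tar.extractfile(member)
                text = io.TextIOWrapper(handle, encoding="utf-8")
                reader = csv.DictReader(text, delimiter="\t")
                return {row["Name"]: float(row["TPM"]) for row in reader}
    raise FileNotFoundError("no quant.sf found in archive")
```

Calling `read_quant_tpm("GSM7766786_P053_quant.tar.gz")` on a downloaded copy should then give a dictionary keyed by transcript identifier, provided the archive matches the assumed layout.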