Young stems were collected from a wild E. arvense population growing in Ithaca, NY, and frozen in liquid nitrogen. RNA was extracted (Wan & Wilkins, 1994), and 500 ng of total RNA was selected and amplified using the TargetAmp aRNA amplification kit (Epicentre, Madison, WI, U.S.A.). The product was purified using the RNeasy Mini Kit (Qiagen, Germantown, MD, U.S.A.), and cDNA libraries were made using the SuperScript™ Choice System (Life Technologies, Carlsbad, CA, U.S.A.), with a mix of polyT and random hexamer DNA primers for first-strand synthesis and only random hexamers for second-strand synthesis. cDNAs were purified using the PureLink PCR Purification Kit (Life Technologies, Carlsbad, CA, U.S.A.). Libraries were prepared for 454 pyrosequencing on a 454 Genome Sequencer FLX system with Titanium chemistry according to the manufacturer’s instructions (Roche Diagnostics, Indianapolis, IN, U.S.A.) and sequenced at the Cornell University BRC DNA sequencing facility (http://cores.lifesciences.cornell.edu/brcinfo/?f=1). The raw sequence files in SFF format were base called using the Pyrobayes base caller (Quinlan et al., 2008). Low-quality regions and adaptor sequences were then removed using LUCY (Chou & Holmes, 2001) and SeqClean (https://github.com/gentoo-science/sci/blob/master/sci-biology/seqclean/seqclean-110625.ebuild). The resulting high-quality sequences were screened against the NCBI UniVec database and E. coli genome sequences to remove possible contamination, and sequences shorter than 30 base pairs were discarded. The processed high-quality sequences were assembled de novo using iAssembler (Zheng et al., 2011). After assembly, the unigenes were annotated by BLAST searches against the GenBank non-redundant (nr) protein database with an E-value cutoff of 1e-5.
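The two numeric thresholds in this workflow (the 30 bp minimum length and the 1e-5 E-value cutoff) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study, which relied on LUCY, SeqClean, iAssembler and BLAST. The function names are hypothetical, the BLAST results are assumed to be in the standard 12-column tabular format, hits at exactly the cutoff are assumed to be kept, and keeping only the lowest-E-value hit per unigene is an assumption made for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

MIN_LENGTH = 30    # from the text: sequences shorter than 30 bp are discarded
MAX_EVALUE = 1e-5  # from the text: BLAST E-value cutoff


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_short_reads(
    records: Iterable[Tuple[str, str]], min_length: int = MIN_LENGTH
) -> List[Tuple[str, str]]:
    """Keep only sequences of at least min_length bases."""
    return [(name, seq) for name, seq in records if len(seq) >= min_length]


def best_hits(
    blast_lines: Iterable[str], max_evalue: float = MAX_EVALUE
) -> Dict[str, Tuple[str, float]]:
    """Return the lowest-E-value hit per query from tabular BLAST output.

    Assumes the standard 12-column layout: column 1 is the query ID,
    column 2 the subject ID and column 11 the E-value.
    Selecting a single best hit is an illustrative assumption.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > max_evalue:
            continue  # hit is weaker than the cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best
```

In practice the length filter would run on the cleaned reads before assembly, and the E-value filter on the BLAST output for the assembled unigenes.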