Sample GSM3304661
Status Public on Dec 13, 2018
Title input sequencing
Sample type SRA
Source name synthetic
Organism synthetic construct
Characteristics barcode: AGTCAA
Extracted molecule genomic DNA
Extraction protocol pri-miRNA variants containing 29–31-nucleotide barcodes and random sequences at two junctional positions were prepared by oligonucleotide annealing, extension, and PAGE gel purification. After quantification of the initial pools using the NEBNext library quantitation kit (NEB), 16,000,000 molecules per pri-miRNA group were reamplified by PCR to decrease barcode complexity. An aliquot of the amplified DNA was sequenced as the "dictionary". Next, T7 in vitro transcription was used to produce pri-miRNAs from the above DNA pools, and a subset of the RNA was sequentially treated with alkaline phosphatase and T4 polynucleotide kinase so that it could be traced using gamma ATP. An aliquot of the transcribed RNA was stored as the "input". In vitro DROSHA processing was carried out on the "input" pri-miRNAs, and 5′ cleavage fragments were purified and ligated to a 3′ adapter that contains 4-nucleotide random sequences. Next, reverse transcription using SuperScript III (Thermo Fisher) was carried out on the "cleavage fragment" as well as the "input" RNAs. After amplification by PCR, a 6% native PAGE gel was used to purify the DNA libraries.
Illumina adapter sequences were incorporated into the libraries using PCR.
Library strategy OTHER
Library source genomic
Library selection other
Instrument model Illumina HiSeq 2500
Description degenerate DNA library made by PCR
Data processing Base calling was performed with RTA (Real Time Analysis, v1.18).
For the "dictionary", which contains the barcode and variant pair information, fastq-join was first used to join the paired-end reads (maximum difference: 5%; -p 5). Then, after filtering reads with fastq_quality_filter (minimum quality score: 30, minimum percent of bases having that quality: 90; -q 30 -p 90), cutadapt was run twice in succession (default parameters): first to trim the 3′ flanking sequence, and second to split the barcode from the variant. After the initial filter, reads containing sequencing errors were discarded, and the unique barcode-variant pairs were reported as the "processed dictionary file". Note that this file contains sequences that were not among the initially designed variants, i.e., sequences arising from errors during oligo synthesis, PCR, or sequencing.
For the "input" library, cutadapt was first used to trim the adapter (minimum trimmed read length: 25, maximum trimmed read length: 35; -m 25 -M 35), and then fastq_quality_filter was applied (minimum quality score: 30, minimum percent of bases having that quality: 100; -q 30 -p 100). The "processed input file" contains the read count for each barcode.
For the "cleavage" library, after trimming the 3′ adapter using cutadapt (minimum trimmed read length: 30; -m 30), fastq_quality_filter was applied (minimum quality score: 30, minimum percent of bases having that quality: 95; -q 30 -p 95). Then, because of the random tetramer in the 3′ adapter, 4 nucleotides were trimmed from the 3′ end of the reads with fastx_trimmer (-t 4), and the barcode-variant pairs were split using cutadapt. Subsequently, reads containing sequencing errors were discarded. The "processed cleavage fragment file" contains barcodes and 5′ cleavage fragments.
For the "variant_counts", only the initially designed pri-miRNA variants were retained.
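The read filters described above can be illustrated with a minimal Python sketch. It is not the submitters' pipeline, which used cutadapt and the FASTX-Toolkit; it only mimics the stated criteria (-q/-p quality filter, -m/-M length window, 4-nucleotide 3′ trim). The function names, the Phred+33 quality encoding, and the assumption that an adapter-trimmed "input" read is itself the barcode are all illustrative assumptions, not taken from the record.

```python
from collections import Counter

PHRED_OFFSET = 33  # assumption: Phred+33 quality encoding


def passes_quality(qual, min_q=30, min_pct=100.0):
    """True if at least min_pct percent of bases have Phred score >= min_q.

    Mirrors the -q / -p criterion quoted in the record (e.g. -q 30 -p 100).
    """
    if not qual:
        return False
    good = sum(1 for ch in qual if ord(ch) - PHRED_OFFSET >= min_q)
    return 100.0 * good / len(qual) >= min_pct


def trim_3prime(seq, qual, n=4):
    """Drop n bases from the 3' end, as done for the random tetramer (-t 4)."""
    if n <= 0:
        return seq, qual
    return seq[:-n], qual[:-n]


def count_input_barcodes(reads, min_len=25, max_len=35):
    """Count reads per barcode for the "input" library.

    reads: iterable of (sequence, quality_string) pairs, already adapter-trimmed.
    Assumption: the whole trimmed read is treated as the barcode.
    """
    counts = Counter()
    for seq, qual in reads:
        if min_len <= len(seq) <= max_len and passes_quality(qual, 30, 100.0):
            counts[seq] += 1
    return counts
```

For example, `count_input_barcodes([("A" * 29, "I" * 29)])` counts one read for that barcode, whereas a read with a single low-quality base is dropped under the 100% threshold.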
Genome_build: n/a
Supplementary_files_format_and_content: Provided are tab-delimited text files that contain "barcode-variant pairs" for the dictionary, "barcode-count pairs" for the input, and "barcode-fragment pairs" for the cleavage files. "variant_counts.txt" contains all merged information derived from the initially designed pri-miRNA sequences.
Supplementary_files_format_and_content: dictionary.txt: barcode-variant pairs
Supplementary_files_format_and_content: input_barcode_counts.txt: barcode-input count pairs
Supplementary_files_format_and_content: barcode_cleavage_frag.txt: barcode-cleavage fragment pairs
Supplementary_files_format_and_content: designed_variants.txt: initially designed pri-miRNA variants
Supplementary_files_format_and_content: variant_counts.txt: merged information of variant-input count-cleavage fragment
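How the three barcode-keyed files might be combined into per-variant information can be sketched as follows. This is a hypothetical illustration: the actual column layout of variant_counts.txt is not described in this record, and the function name and the in-memory data structures are assumptions.

```python
from collections import Counter


def merge_variant_counts(dictionary, input_counts, cleavage_pairs, designed):
    """Aggregate barcode-level data per designed variant (hypothetical layout).

    dictionary:     dict mapping barcode -> variant sequence
    input_counts:   dict mapping barcode -> read count in the input library
    cleavage_pairs: iterable of (barcode, cleavage_fragment) pairs
    designed:       set of initially designed variant sequences
    """
    merged = {v: {"input": 0, "fragments": Counter()} for v in designed}

    # Sum input counts over all barcodes assigned to a designed variant.
    for barcode, n in input_counts.items():
        variant = dictionary.get(barcode)
        if variant in merged:
            merged[variant]["input"] += n

    # Tally each observed cleavage fragment under its variant.
    for barcode, fragment in cleavage_pairs:
        variant = dictionary.get(barcode)
        if variant in merged:
            merged[variant]["fragments"][fragment] += 1

    return merged
```

Barcodes that are missing from the dictionary, or that map to a sequence outside the designed set, are ignored, which matches the statement that only initially designed variants are kept.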
Submission date Jul 24, 2018
Last update date Dec 13, 2018
Contact name S. Chul Kwon
Organization name Institute for Basic Science
Street address 1 Gwanak-ro
City Seoul
ZIP/Postal code 08826
Country South Korea
Platform ID GPL19604
Series (1)
GSE117600 High-throughput in vitro DROSHA processing on 38,880 pri-miRNA variants
BioSample SAMN09710184
SRA SRX4453289

Supplementary file Size Download File type/resource
GSM3304661_input_barcode_counts.txt.gz 88.6 Mb (ftp)(http) TXT
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record

