GEO Accession viewer

Sample GSM3304660

Status

Public on Dec 13, 2018

Title

dictionary sequencing

Sample type

SRA

Source name

synthetic

Organism

synthetic construct

Characteristics

barcode: ACAGTG

Extracted molecule

genomic DNA

Extraction protocol

Pri-miRNA variants containing 29–31-nucleotide barcodes and random sequences at two junctional positions were prepared by oligonucleotide annealing, extension, and PAGE gel purification. After quantification of the initial pools using the NEBNext library quantitation kit (NEB), 16,000,000 molecules per pri-miRNA group were reamplified by PCR to decrease barcode complexity. An aliquot of the amplified DNA was sequenced as the "dictionary". Next, T7 in vitro transcription was used to produce pri-miRNAs from the above DNA pools, and a subset of the RNA was sequentially treated with alkaline phosphatase and T4 polynucleotide kinase so that it could be traced using gamma ATP. An aliquot of the transcribed RNA was stored as the "input". In vitro DROSHA processing was carried out on the "input" pri-miRNAs, and the 5′ cleavage fragments were purified and ligated to a 3′ adapter containing a 4-nucleotide random sequence. Next, reverse transcription using SuperScript III (Thermo Fisher) was carried out on the "cleavage fragment" as well as the "input" RNAs. After amplification by PCR, a 6% native PAGE gel was used to purify the DNA libraries.
Illumina adapter sequences were incorporated into the libraries using PCR.

Library strategy

OTHER

Library source

genomic

Library selection

other

Instrument model

Illumina HiSeq 2500

Description

degenerate DNA library made by PCR

Data processing

Base calling was done by RTA (Real Time Analysis, v1.18).
For the "dictionary", which contains the barcode and variant pair information, fastq-join was first used to join the paired-end reads (maximum difference: 5%; -p 5). Then, after filtering reads through fastq_quality_filter (minimum quality score: 30, minimum percent of bases having that quality: 90; -q 30 -p 90), cutadapt was run twice in succession (default parameters): first to trim the 3′ flanking sequence, and second to split the barcode from the variant. After the initial filter, reads containing sequencing errors were discarded, and the unique barcode-variant pairs were reported as the "processed dictionary file". Note that this file also contains sequences that are not among the initially designed variants, i.e., products of errors during oligo synthesis, PCR, or sequencing.
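The quality criterion and the reporting of unique pairs described above can be illustrated with a minimal Python sketch. It is not the FASTX-Toolkit implementation and may differ from it in rounding details. It assumes Phred+33 quality encoding, and the function names (`passes_quality_filter`, `unique_pairs`) are hypothetical.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality_string]


def passes_quality_filter(quality_string, min_quality=30, min_percent=90):
    """Keep a read if at least min_percent of its bases have a score >= min_quality.

    Mirrors the rule stated for 'fastq_quality_filter -q 30 -p 90';
    an illustration only, not the tool's own code.
    """
    scores = phred_scores(quality_string)
    if not scores:
        return False
    good = sum(1 for score in scores if score >= min_quality)
    # Integer comparison avoids floating-point rounding at the threshold.
    return good * 100 >= min_percent * len(scores)


def unique_pairs(pairs):
    """Return barcode-variant pairs with duplicates removed, in first-seen order."""
    seen = set()
    result = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            result.append(pair)
    return result
```

With `min_percent=100`, the same function expresses the stricter setting quoted for the "input" library (-q 30 -p 100).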
For the "input" library, cutadapt was first used to trim the adapter (minimum trimmed read length: 25, maximum trimmed read length: 35; -m 25 -M 35), and then fastq_quality_filter was applied (minimum quality score: 30, minimum percent of bases having that quality: 100; -q 30 -p 100). The "processed input file" contains the count for each barcode.
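As a rough illustration of how per-barcode counts could be tallied after trimming, the sketch below applies the stated length window (25 to 35 nucleotides) and counts identical sequences. It assumes each trimmed read is itself the barcode, which the record does not state explicitly, and `count_barcodes` is a hypothetical name.

```python
from collections import Counter


def count_barcodes(trimmed_reads, min_len=25, max_len=35):
    """Count reads per barcode, keeping only reads inside the length window.

    Assumption: each adapter-trimmed read is treated as the barcode sequence.
    """
    return Counter(
        read for read in trimmed_reads if min_len <= len(read) <= max_len
    )
```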
For the "cleavage" library, after trimming the 3′ adapter using cutadapt (minimum trimmed read length: 30; -m 30), fastq_quality_filter was used (minimum quality score: 30, minimum percent of bases having that quality: 95; -q 30 -p 95). Then, because of the random tetramer in the 3′ adapter, 4 nucleotides were trimmed from the 3′ end of the reads with fastx_trimmer (-t 4), and the barcode-variant pairs were split using cutadapt. Subsequently, reads containing sequencing errors were discarded. The "processed cleavage fragment file" contains barcodes and 5′ cleavage fragments.
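The removal of the random tetramer amounts to dropping the last 4 bases of each read together with their quality values. A minimal Python sketch of that step follows; `trim_three_prime` is a hypothetical helper, not part of FASTX-Toolkit.

```python
def trim_three_prime(sequence, quality, n=4):
    """Remove n bases from the 3' end of a read and its quality string.

    Illustrates the effect described for 'fastx_trimmer -t 4'.
    Reads with n or fewer bases come back empty.
    """
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    keep = max(0, len(sequence) - n)
    return sequence[:keep], quality[:keep]
```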
For the "variant_counts" file, only the initially designed pri-miRNA variants were considered and retained.
Genome_build: n/a
Supplementary_files_format_and_content: Provided are tab-delimited text files which contain "barcode-variant pairs" for the dictionary, "barcode-count pairs" for the input, and "barcode-fragment pairs" for the cleavage files. "variant_counts.txt" contains all merged information derived from the initially designed pri-miRNA sequences.
Supplementary_files_format_and_content: dictionary.txt: barcode-variant pairs
Supplementary_files_format_and_content: input_barcode_counts.txt: barcode-input count pairs
Supplementary_files_format_and_content: barcode_cleavage_frag.txt: barcode-cleavage fragment pairs
Supplementary_files_format_and_content: designed_variants.txt: initially designed pri-miRNA variants
Supplementary_files_format_and_content: variant_counts.txt: merged information of variant-input count-cleavage fragment
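One way the three barcode-keyed tables might be combined into per-variant information, restricted to the designed variants, is sketched below in Python. The in-memory layout, the output structure, and the name `merge_variant_counts` are all assumptions made for illustration. The record does not describe how variant_counts.txt was actually built or what its columns are.

```python
from collections import Counter


def merge_variant_counts(dictionary, input_counts, cleavage_pairs, designed):
    """Merge barcode-level data into per-variant summaries.

    dictionary:     dict mapping barcode -> variant sequence
    input_counts:   dict mapping barcode -> read count in the input library
    cleavage_pairs: iterable of (barcode, cleavage_fragment) tuples
    designed:       iterable of initially designed variant sequences

    Barcodes that are unknown, or that map to a variant outside the
    designed set, are ignored.
    """
    merged = {
        variant: {"input": 0, "fragments": Counter()} for variant in designed
    }
    for barcode, count in input_counts.items():
        variant = dictionary.get(barcode)
        if variant in merged:
            merged[variant]["input"] += count
    for barcode, fragment in cleavage_pairs:
        variant = dictionary.get(barcode)
        if variant in merged:
            merged[variant]["fragments"][fragment] += 1
    return merged
```

Keying everything by barcode first and resolving to variants only at merge time means synthesis or PCR error products in the dictionary drop out simply because their variant is not in the designed set.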

Submission date

Jul 24, 2018

Last update date

Dec 13, 2018

Contact name

S. Chul Kwon

E-mail(s)

chul@hku.hk

Organization name

The University of Hong Kong

Department

School of Biomedical Sciences

Street address

L1-05, Laboratory Block, 21 Sassoon Road, Pokfulam

City

Hong Kong

State/province

ZIP/Postal code

Country

Hong Kong

Platform ID

GPL19604

Series (1)

GSE117600

High-throughput in vitro DROSHA processing on 38,880 pri-miRNA variants

Relations

BioSample

SAMN09710185

SRA

SRX4453288

Supplementary file	Size	Download	File type/resource
GSM3304660_dictionary.txt.gz	140.4 Mb	(ftp)(http)	TXT
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record