Platform GPL11273
Status Public on Sep 13, 2011
Title McIntyre D. simulans 2m v1.0
Technology type in situ oligonucleotide
Distribution custom-commercial
Organism Drosophila simulans
Manufacturer Affymetrix
Manufacture protocol The chip has approximately 2.5 million features covering four types of probe sets: custom SNP probe sets; 3’ expression probe sets from the Affymetrix Drosophila expression array 2.0 (all perfect-match probes); selected probes from the Affymetrix Drosophila tiling array version 2.0 (exonic perfect-match probes); and standard Affymetrix negative control probe sets. Custom SNP probe sets were designed for genotyping and allele-specific expression (ASE) assays. Each SNP probe set includes 24 probes, covering the four possible bases at the SNP site, both forward and reverse strands, and three positions of the SNP relative to the central base of the probe.
For SNP identification, alignment sets were created from multiple sequence sources, including FlyBase R5.4 exons (68,536), 6 DPGP D. simulans strain genomes (assembled against the R4.2 D. melanogaster genome), and all Genbank D. simulans sequences (343,420) that were not annotated as “whole genome”. For each R5.4 exon, the genome location in R4.2 was determined by BLAST alignment to the D. melanogaster R4.2 genome. For exons with multiple hits, the location was taken as the one with the longest alignment. Exons that mapped to more than one location or to chromosomes 4 or U were excluded. The remaining unique exons were aligned by BLAST to the DPGP genomes and Genbank sequences. All sequences were then aligned using ClustalW, creating a multiple sequence alignment for each exon at its genome position, from which SNPs were identified. SNPs were selected by quality criteria using information from their supporting sequences. A SNP was discarded if fewer than 5 sequences supported it, or if more than 1 SNP occurred in the design window, making the alignment suspect. Genome position was not considered in SNP selection; the selected SNPs could be located at any position in an exon. There were 566 exons where SNP data were identified from Genbank alone, and 52,188 exons for which SNP data were identified from DPGP alone. Only 357 exons had no identified SNPs in either Genbank or DPGP.
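The two discard criteria described above can be sketched as a small filter. This is a minimal illustration, not the submitters' pipeline: the `CandidateSnp` fields, the function name, and the use of a 17-base flank on each side of the SNP (the 35 bp design window given in the protocol) are assumptions made for the example.

```python
from dataclasses import dataclass

MIN_SUPPORT = 5   # from the protocol: fewer than 5 supporting sequences -> discard
FLANK = 17        # assumed: 17 bases each side of the SNP (35 bp design window)


@dataclass(frozen=True)
class CandidateSnp:
    """Hypothetical record for one SNP found in an exon alignment."""
    exon_id: str
    position: int       # position of the SNP within the alignment
    n_supporting: int   # number of sequences supporting the SNP


def passes_quality(snp, exon_snps):
    """Return True if the SNP survives both discard criteria."""
    if snp.n_supporting < MIN_SUPPORT:
        return False
    # Any other SNP inside the design window makes the alignment suspect.
    for other in exon_snps:
        if other is not snp and abs(other.position - snp.position) <= FLANK:
            return False
    return True


def select_snps(exon_snps):
    """Keep only the SNPs of one exon that pass the quality criteria."""
    return [s for s in exon_snps if passes_quality(s, exon_snps)]
```

Note that the filter never inspects where in the exon a SNP lies, in line with the statement that genome position was not considered in SNP selection.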
At each nucleotide where a D. simulans SNP was present, a 35 bp design window was created: 17 bp upstream and 17 bp downstream of the SNP. There were 589,915 design windows created, corresponding to 13,637 genes. Design windows were compared to the D. melanogaster genome (v 4.7 as this is the assembly used for the DPGP data) using BLAST. There were 563,558 design windows found to be unique in the genome. If there were multiple SNPs in a design window, or if the SNPs identified were not biallelic, those SNPs were discarded. After this initial selection, there were 196,345 biallelic SNPs in unique design windows with no other SNPs. For each design window, 24 probes were constructed: probes for all 4 bases were designed, 12 each for the forward and reverse strands, with the SNP at the 0, +4, and -4 positions. Probe sets were eliminated if any probe in the probe set contained a homopolymer run or could not be synthesized. The remaining 189,946 probe sets were examined for hybridization quality, predicted using an Affymetrix internal scoring algorithm that takes into account Tm, secondary structure, and previous empirical observations. Probe sets were eliminated if 1/3 or more of the probes had poor predicted hybridization. For all genes with 7 or fewer SNPs, all SNPs were selected. Finally, if a gene had more than 7 SNPs, many of them with high coverage (more than 4 lines supporting the SNP), the probe sets with the best predicted hybridization were selected. This led to a total of 61,142 probe sets selected. To fill the chip, an additional 610 were selected at random from the set of genes with more than 7 SNPs per gene, for a total of 61,752 probe sets. Overall, SNP probe sets were designed for 11,946 genes (R5.26, 12,385 genes of R5.4 at the time of chip design), covering 87% of the known transcriptome.
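The 24-probe layout (4 bases x 3 SNP positions x 2 strands) can be illustrated by enumerating probes from a single 35 bp design window. This is a sketch under stated assumptions, not the Affymetrix design procedure: the 25-base probe length is inferred from the sequences in the data table, the sign convention for the offset is a guess, the reverse-strand probe is taken to be the reverse complement of the forward one, and `design_probes` is a hypothetical name.

```python
from itertools import product

WINDOW_LEN = 35            # design window: 17 + SNP + 17
SNP_INDEX = 17             # 0-based index of the SNP in the window
PROBE_LEN = 25             # assumption: 25-mers, as in the data table
OFFSETS = (0, 4, -4)       # SNP position relative to the probe's central base
BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def design_probes(window):
    """Enumerate the 24 probes for one design window (illustrative only)."""
    if len(window) != WINDOW_LEN:
        raise ValueError("design window must be 35 bases long")
    half = PROBE_LEN // 2  # index of the central base within a probe
    probes = []
    for base, offset, strand in product(BASES, OFFSETS, "+-"):
        # Substitute the candidate base at the SNP site.
        variant = window[:SNP_INDEX] + base + window[SNP_INDEX + 1:]
        # Assumed convention: the SNP sits at probe index half + offset.
        start = SNP_INDEX - half - offset
        seq = variant[start:start + PROBE_LEN]
        if strand == "-":
            seq = reverse_complement(seq)
        probes.append({"snpbase": base, "offset": offset,
                       "strand": strand, "SEQUENCE": seq})
    return probes
```

With these assumptions, the most extreme probes start at index 1 (offset +4) and end at index 33 (offset -4) of the window, so every 25-mer fits inside the 35 bp window.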
Also printed on the chip were all the perfect match (PM) probes on the 3’ IVT Affymetrix Drosophila 2.0 array (n=262,766); all tiling probes in exons (FlyBase 5.11, n=699,856) from the Affymetrix Drosophila 2.0 tiling array; GC band controls (n=16,943); and hybridization and labeling controls (n=1,644). Overall, a total of 2,463,630 probes are on the chip. The probes from the 3’ IVT array allow for clear gene-level detection calls and are referred to as “3’ expression probes” in the following text. The tiling probes provide controls for measuring signal fluctuation caused by 5’ bias in expression assays, as well as for measuring alternative exon usage, and are referred to as “exon probes” hereafter.
 
 
Contributor(s) Yang Y, Graze RM, Waltz BM, Nuzhdin S, Wayne ML, McIntyre LM
Submission date Dec 02, 2010
Last update date Sep 13, 2011
Contact name Yajie Yang
E-mail(s) yyj920@gmail.com
Organization name UF
Street address 2033 Mowry Road Room 118
City Gainesville
State/province FL
ZIP/Postal code 32610
Country USA
 
Samples (15) GSM787984, GSM787985, GSM787986, GSM787987, GSM787988, GSM787989 
Series (1)
GSE31750 Verification of a custom chip which measures abundance, isoforms and alleles in Drosophila

Data table header descriptions
ID
probesetID
fbgn FlyBase gene identifier (FBgn)
chromosome Chromosome
genome_position Genome position
offset The offset position from the central SNP base for the SNP module probes
snpbase The four possible bases A/C/G/T used for genotyping
strand The strand on which the SNP probe was designed
SEQUENCE The probe sequence for each probe on the chip
module Probe module: SNP, exon, expression, or controls
symbol Gene symbol name
SPOT_ID

Data table
ID probesetID fbgn chromosome genome_position offset snpbase strand SEQUENCE module symbol SPOT_ID
1616608_pm_a_0 1616608_pm_a FBgn0001128 2L 5947134 . . . ACTGCCGAGGAGGTCAACTACATGC ivt Gpdh
1616608_pm_a_1 1616608_pm_a FBgn0001128 2L 5947137 . . . GCCGAGGAGGTCAACTACATGCTGA ivt Gpdh
1616608_pm_a_10 1616608_pm_a FBgn0001128 2L 5947291 . . . TCAGCTCAAGCCTAATGATTTAATT ivt Gpdh
1616608_pm_a_11 1616608_pm_a FBgn0001128 2L 5947305 . . . ATGATTTAATTGATTGCATACGCAA ivt Gpdh
1616608_pm_a_12 1616608_pm_a FBgn0001128 2L 5947317 . . . ATTGCATACGCAATCACCCTGAGCA ivt Gpdh
1616608_pm_a_13 1616608_pm_a FBgn0001128 2L 5947321 . . . CATACGCAATCACCCTGAGCATATG ivt Gpdh
1616608_pm_a_2 1616608_pm_a FBgn0001128 2L 5947142 . . . GGAGGTCAACTACATGCTGAAGAAC ivt Gpdh
1616608_pm_a_3 1616608_pm_a FBgn0001128 2L 5947148 . . . CAACTACATGCTGAAGAACAAGGGT ivt Gpdh
1616608_pm_a_4 1616608_pm_a . . . . . . GAACAAGGGTCTGGAGGACAAATTC ivt . --unknown
1616608_pm_a_5 1616608_pm_a FBgn0001128 2L 5947257 . . . CCCTGTTCACGGCTATTCACAAAAT ivt Gpdh
1616608_pm_a_6 1616608_pm_a FBgn0001128 2L 5947261 . . . GTTCACGGCTATTCACAAAATATGC ivt Gpdh
1616608_pm_a_7 1616608_pm_a FBgn0001128 2L 5947265 . . . ACGGCTATTCACAAAATATGCACAA ivt Gpdh
1616608_pm_a_8 1616608_pm_a FBgn0001128 2L 5947280 . . . ATATGCACAAATCAGCTCAAGCCTA ivt Gpdh
1616608_pm_a_9 1616608_pm_a FBgn0001128 2L 5947284 . . . GCACAAATCAGCTCAAGCCTAATGA ivt Gpdh
1622892_pm_s_0 1622892_pm_s FBgn0035889 3L 8395981 . . . GATGTGCTAACTCAACTGTATGCTT ivt mkg-p
1622892_pm_s_1 1622892_pm_s FBgn0035889 3L 8395935 . . . CGAAGGAGCCGAAACACTACAAGTG ivt mkg-p
1622892_pm_s_10 1622892_pm_s FBgn0035889 3L 8395701 . . . AAGGAGATCTCCTCGGACTGCAGCG ivt mkg-p
1622892_pm_s_11 1622892_pm_s FBgn0035889 3L 8395677 . . . GCAATAGTCCGCTAACAAGGCTTTT ivt mkg-p
1622892_pm_s_12 1622892_pm_s FBgn0035889 3L 8395649 . . . GGCGTTTAAGAATCTCCTGGGCAAC ivt mkg-p
1622892_pm_s_13 1622892_pm_s FBgn0035889 3L 8395603 . . . AGCACCTGGAAGTTTTGCGCCACAA ivt mkg-p

Total number of rows: 2463424

Table truncated, full table size 216252 Kbytes.
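For readers who want to work with the rows shown above, here is a minimal parsing sketch. It assumes whitespace-separated fields as they appear in this listing (the downloadable table may use a different delimiter), treats `.` as a missing value, and pads rows whose trailing SPOT_ID is empty. The function name `parse_row` is hypothetical.

```python
COLUMNS = ["ID", "probesetID", "fbgn", "chromosome", "genome_position",
           "offset", "snpbase", "strand", "SEQUENCE", "module", "symbol",
           "SPOT_ID"]


def parse_row(line):
    """Parse one data-table row into a dict keyed by column name."""
    fields = line.split()
    # Rows with an empty SPOT_ID have only 11 fields; pad to 12.
    fields += [""] * (len(COLUMNS) - len(fields))
    row = {name: (None if value in (".", "") else value)
           for name, value in zip(COLUMNS, fields)}
    # Genome positions are integers when present.
    if row["genome_position"] is not None:
        row["genome_position"] = int(row["genome_position"])
    return row


example = parse_row(
    "1616608_pm_a_0 1616608_pm_a FBgn0001128 2L 5947134 . . . "
    "ACTGCCGAGGAGGTCAACTACATGC ivt Gpdh"
)
```

Here `example["symbol"]` is `"Gpdh"`, `example["genome_position"]` is the integer `5947134`, and the SNP-specific columns (`offset`, `snpbase`, `strand`) are `None` because this probe belongs to the `ivt` module.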





Supplementary file Size Download File type/resource
GPL11273_dros_snpa520726F.CDF.gz 14.4 Mb (ftp)(http) CDF
