DIAS_PIG_55K3_v1 consists of 55,488 features: 26,877 PCR products from cDNA clones and 867 control features, spotted as neighboring duplicates in a 4 x 12 arrangement of sub-arrays, each consisting of 34 x 34 spots. The DNA fragments were amplified from cDNA clones selected from the cDNA libraries generated by the Sino-Danish Pig Genome Sequencing Consortium (Gorodkin et al.: Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 ESTs. Genome Biol 2007, 8:R45). The cDNA clones were selected to represent the largest possible number of human gene transcripts. EST clusters for human gene transcripts in NCBI’s RefSeq database release 17 were created using the BLASTN sequence similarity program, implemented to run on a DeCypher computer [http://www.timelogic.com], with a P-value at or below 10^-8. Within each cluster, the one cDNA with the minimum predicted distance to the 3’ end of the human gene transcript was selected. Microarray cDNAs were mapped to GO terms and KEGG pathways based on the human accession ID obtained from BLASTN, using the AnnBuilder package from Bioconductor. To represent uncharacterized genes on the microarray, a set of EST sequence clusters without BLASTN sequence similarity to any known human gene transcript was created and clustered using the “all-vs-all” TERACLU algorithm on a DeCypher computer from TimeLogic [http://www.timelogic.com]. For each EST cluster with a minimum depth of 3 ESTs, we added one cDNA to the selection list, choosing the cDNA with the minimum predicted distance to the 3’ end of the assembled EST contig. Of the 26,877 cDNAs, 21,417 map to 15,831 human gene transcript IDs, corresponding to roughly 1.35 cDNAs per gene transcript. The remaining 5,460 cDNAs were thus estimated to cover around 4,036 gene transcripts, yielding approximately 19,867 gene transcripts in total.
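The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, and it assumes that every cDNA and control feature is spotted exactly twice and that the unmapped cDNAs share the same redundancy (cDNAs per transcript) as the mapped ones. Under those assumptions, both the duplicate count and the sub-array layout give 55,488 features.

```python
# Figures quoted in the platform description.
cdna_features = 26_877
control_features = 867
sub_arrays = 4 * 12            # 4 x 12 arrangement of sub-arrays
spots_per_sub_array = 34 * 34  # 34 x 34 spots per sub-array

# Every feature is spotted as a neighboring duplicate (assumed: exactly twice).
spotted = 2 * (cdna_features + control_features)
capacity = sub_arrays * spots_per_sub_array

# Transcript coverage estimate.
mapped_cdnas = 21_417
human_transcripts = 15_831
unmapped_cdnas = cdna_features - mapped_cdnas

# Average number of cDNAs per human gene transcript among the mapped clones.
redundancy = mapped_cdnas / human_transcripts

# Assumption: the unmapped cDNAs are as redundant as the mapped ones.
extra_transcripts = round(unmapped_cdnas / redundancy)
total_transcripts = human_transcripts + extra_transcripts

print(spotted, capacity)
print(round(redundancy, 2), extra_transcripts, total_transcripts)
```

Note that the estimate of about 4,036 additional transcripts comes from dividing by the unrounded ratio; dividing by the rounded value 1.35 gives a slightly different figure.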
The selected clones were rearrayed from the libraries using a QPixII (Genetix), amplified by PCR, and purified using the 384 well Microarray PCR Purification Kit (Telechem International). The PCR products were evaluated on agarose gels, dried in a speed vac, and resuspended in spotting solution. The control features consist of positive spots (also known as “landing lights”), blanks (spotting buffer), and selected spots from the Lucidea Universal ScoreCard (GE Healthcare), which contains both intensity dilution series and different ratios. The positive spot was prepared by PCR amplification of a bovine EST fragment with a primer pair in which one of the primers was labeled with Alexa-488 or Alexa-594. The arrays were spotted on UltraGAPS slides (Corning Incorporated) at 20 °C and 55% RH using an SDDC-2 MicroArrayer (BioRad) with 48 x 946 MP2.5 pins (Telechem International), with DNA resuspended in Micro Spotting Solution Plus (Telechem International). Slides were dried, UV cross-linked (300 mJ), and stored in a vacuum desiccator until use.
The sequences of the cDNA clones can be found in the attached file "GPL3608_array_cDNA_28k_2005-07-13.cDNA.scr.zip". The attached .gal file contains the original annotation, but an updated version of the annotations has been obtained by BLAST against mammalian RefSeq (version 34, file: "GPL3608_p55k-vs-mam-refseq34-2009-04-17.txt") and against human RefSeq (release 2009-03-09, file: "GPL3608_p55k-vs-hsa-refseq-2009-04-17.txt"), using the following criteria: Rank = 1, minimum alignment length = 50, minimum percent alignment = 85%. Furthermore, an annotation package (file: "GPL3608_p55k3Ver2HsRs20090309AL50PA85_1.0.zip") has been built in R (version 2.8.0) using AnnBuilder (version 1.20.0).
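The filtering criteria stated above (Rank = 1, minimum alignment length = 50, minimum percent alignment = 85%) can be sketched as follows. This is a minimal illustration, not the procedure actually used: the record layout, the field names, the clone and RefSeq identifiers, and the toy hits are all invented for the example, and the column layout of the attached annotation files is not assumed. The thresholds are treated as inclusive, which is our reading of "minimum".

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    """One BLAST hit; field names are hypothetical, not taken from the files."""
    clone_id: str
    refseq_id: str
    rank: int
    alignment_length: int
    percent_alignment: float


def select_annotations(
    hits: Iterable[Hit],
    rank: int = 1,
    min_length: int = 50,
    min_percent: float = 85.0,
) -> Dict[str, str]:
    """Map each clone to its hit of the given rank, if it passes both thresholds."""
    return {
        h.clone_id: h.refseq_id
        for h in hits
        if h.rank == rank
        and h.alignment_length >= min_length
        and h.percent_alignment >= min_percent
    }


# Toy data with made-up identifiers, for illustration only.
toy_hits = [
    Hit("clone_A", "refseq_1", 1, 120, 92.5),  # kept
    Hit("clone_A", "refseq_2", 2, 300, 99.0),  # dropped: not the top-ranked hit
    Hit("clone_B", "refseq_3", 1, 42, 97.0),   # dropped: alignment too short
    Hit("clone_C", "refseq_4", 1, 80, 84.9),   # dropped: below 85%
    Hit("clone_D", "refseq_5", 1, 50, 85.0),   # kept: thresholds are inclusive
]

annotations = select_annotations(toy_hits)
print(annotations)
```

Clones whose top-ranked hit fails either threshold simply receive no updated annotation in this sketch.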